1. Adaptive Systems
1.1. INTRODUCTION 


Adaptive teaching or training systems have been widely discussed in the 
literature (Pask, 1957 a, b, 1960 a, b, 1962, 1964, 1965 a, b; Pask & Wiseman, 
1959; Pask & Lewis, 1962). As applied in the case of human learning, the 
system has the form suggested by Fig. 1(I) (Fig. 1(II) is a more detailed
representation of the same system which will be useful later in the discussion). 
Here, A is a man (the subject or student); B is an adaptive machine; and C 
is a predetermined but unlearnable sequence of variable events, for example, 
of novel problems to be solved in the conduct of a skill. It is assumed that C 
provides nearly sufficient variety to occupy the attention of a proficient 
subject, but that a novice would be crassly overloaded if presented with the 
output from C in an unmodified form. 

Machine B computes a time average of the subject’s behaviour (the pro- 
ficiency measure). This time average is used to guide B in modifying the 
difficulty of the events from C, before they are presented to A. 

If B is an adaptive teaching or training system, the B strategy increases the 
difficulty of the input as A learns and becomes increasingly proficient (we 
may also state this property by saying that B is able to simplify or partially 
solve the problems posed by the C output and that it removes its co-operative 
assistance as it detects an increase in A’s proficiency). But systems of the sort
shown in Fig. 1(I) and Fig. 1(II) are not restricted to teaching applications.
They have been used for mental testing and for several other purposes as
well. In these cases, rather different B strategies are appropriate. Henceforward,
a teaching application will be assumed unless the contrary is stated.

I wish to contend that B controls the learning process in A and to use the 
control paradigm in a more than metaphorical fashion. On it, we may base a 
cybernetic view of teaching (as the control of learning) and later in the
discussion a cybernetic “null point” or “steady state” experimental method.

The crux of the contention is this. As the subject learns, he reduces his
subjective uncertainty regarding the output from C. Machine B, sensing this
reduction, compensates for it by increasing the maximum possible objective
uncertainty presented to the subject, A. It does so by reducing the problem
simplification or conversely by increasing the problem difficulty. Hence, until
the variety of C is exhausted (which it will be when the subject has learned to
deal with all of the possible problems) the coupled system A, B, is a
dynamically stable and non-learning system (even though A learns and B adapts
to compensate for his learning).

Fig. 1. A coupled man-machine adaptive system shown in overview (I) and in detail (II)
together with (III) a typical system behaviour (in terms of proficiency ($p$) and difficulty
level ($\eta$) as a function of time ($t$) or trial number ($n$)).

In so far as it is possible to specify the language, $L^0$, in terms of which A
interacts with B, the maximum objective uncertainty and the conditional
response uncertainty of A may be calculated, and in suitable conditions
(which we discuss later) A’s “subjective uncertainty” can also be estimated.
When all of these conditions are satisfied:


(1) the coupled system A, B, is Lyapunov Stable, until the variety of C is 
exhausted by A’s learning (Pask & Mallen, 1971); 

(2) a suitable Lyapunov Function, $V$, is the measure $1 - \text{Redundancy}$
(where the redundancy is calculated from the maximum objective 
uncertainty and the conditional response uncertainty). 


(1) has been empirically checked in a few cases. (2) leads us to identify
A, B, with a self organizing system in the sense of Von Foerster since Von
Foerster’s (1961) condition, namely

$$\frac{d\,\text{Redundancy}}{dt} > 0$$

is satisfied providing that the Lyapunov condition, namely

$$0 > \frac{dV}{dt}$$

is satisfied.
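
The link between the two conditions can be checked numerically. The following is a minimal sketch, assuming Shannon's definition of redundancy, $R = 1 - H/H_{\max}$, with $H$ standing in for the conditional response uncertainty and $H_{\max}$ for the maximum objective uncertainty. The function names and the three response distributions are hypothetical and are not taken from the experiments.

```python
import math


def entropy(probs):
    """Shannon entropy, in bits, of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def redundancy(h_conditional, h_max):
    """Redundancy R = 1 - H / H_max (Shannon's definition, assumed here)."""
    return 1.0 - h_conditional / h_max


def lyapunov_v(h_conditional, h_max):
    """V = 1 - Redundancy, which reduces to H / H_max."""
    return 1.0 - redundancy(h_conditional, h_max)


# Hypothetical data: four response alternatives, so H_max = 2 bits, and a
# response distribution that sharpens from one stage to the next.
h_max = math.log2(4)
stages = [
    [0.25, 0.25, 0.25, 0.25],
    [0.55, 0.15, 0.15, 0.15],
    [0.85, 0.05, 0.05, 0.05],
]
r_values = [redundancy(entropy(p), h_max) for p in stages]
v_values = [lyapunov_v(entropy(p), h_max) for p in stages]
```

Since $V = 1 - \text{Redundancy}$, any increase in redundancy from one stage to the next is matched by an equal decrease in $V$, which is the sense in which the two inequalities stand or fall together.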

On this concept of A, B stability, I propose to found a cybernetic
experimental method which, roughly speaking, amounts to embedding the
learning subject A, as a subsystem in a non-learning system A, B. Before
doing so, and for that matter, before showing that there is any need to 
provide a cybernetic experimental method, we should be clear about the type 
of system involved and the types of constancy that are maintained. 

The rest of Section 1 of the paper is devoted to a review of some man-machine
control systems that are or have been realized in our laboratory.


1.2. ERROR SCORE CONSTANCY IN A MANUAL TRACKING TASK 

The subject is presented with a compensatory display, usually on a cathode 
ray oscilloscope. In the one-dimensional case shown in Plate 1, he perceives 
the locus of a point on a line and he is required to adjust a manual control in 
order to keep the point within a line segment around a fixed position. His 
manual control determines the acceleration of an idealized vehicle (the 
vehicle characteristics of the system in Plate 1 are specified by a pair of 
integrators). The displayed point (described, to the subject, as an indicator of 
vehicle displacement) varies from its fixed position even if the subject does 
nothing to his control because of an input perturbation that is added to the 
subject’s acceleration control signal before it is integrated. The form of this 
perturbation is unlearnable and hence its value is unpredictable. But the 
subject can learn to “handle the vehicle” when it is perturbed by unpredictable
disturbances. Further, it is empirically safe to assert that the “vehicle
handling” job is made increasingly difficult by an increase in the mean
amplitude, $\eta$, of this perturbation.

The subject’s error score, $u$, is computed either as the average r.m.s.
deviation from the fixed point or as the average value of the modulus of this
deviation. Let $u_0$ be a criterial error score. To maintain constancy of error
score, we set:

$$\eta = \epsilon \int (u_0 - u)\, dt$$

and choose the positive constant $\epsilon$ so that the man-machine system does not
become unstable due to overcompensation. In other words, the task difficulty
is continually modulated in order to maintain the chosen form of constancy,
that $u - u_0 \to 0$ (or, phrased in a slightly different way, the environment is
altered in order to maintain a constant relative or subjective difficulty, that is,
“a difficulty as seen by the subject” and as indicated by his performance).
Systems of this type (or their multidimensional analogues) have been
investigated by Hudson (1962) and Kelly (1966) in the United States and, in
Britain, by Gaines (1968) and by our laboratory (Pask, Tech. Rep.).
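
The integral rule can be sketched in discrete time. This is an illustration only: the clamp at zero and the toy "subject", whose error score is simply the perturbation amplitude divided by a fixed skill constant, are assumptions made so that the loop can be run, and the names `update_difficulty` and `simulate` are hypothetical.

```python
def update_difficulty(eta, u, u0, epsilon, dt=1.0):
    """One Euler step of eta = epsilon * integral((u0 - u) dt)."""
    eta = eta + epsilon * (u0 - u) * dt
    # Assumption: a perturbation amplitude cannot be negative.
    return max(0.0, eta)


def simulate(steps, u0=1.0, epsilon=0.2, skill=2.0):
    """Close the loop around a hypothetical subject with u = eta / skill."""
    eta = 0.0
    trace = []
    for _ in range(steps):
        u = eta / skill  # toy subject: error grows with perturbation amplitude
        eta = update_difficulty(eta, u, u0, epsilon)
        trace.append((eta, u))
    return trace


trace = simulate(200)
final_eta, final_u = trace[-1]
```

With this toy subject the update is eta ← (1 − epsilon/skill)·eta + epsilon·u0, so the error score settles at the criterial value when epsilon/skill is small and oscillates with growing swings once epsilon/skill exceeds 2. That is one concrete reading of instability "due to overcompensation", and it holds for this toy model only.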


1.3. SYSTEMS FOR MAINTAINING A CONSTANT LEVEL OF VIGILANCE 
In Fig. 2, the subject looks out for an important or relevant event in a
background of irrelevant disturbing events. In the system used in our laboratory,
the events were visual patterns. The subject is required to respond in a
predefined correct fashion if, and only if, a relevant event occurs. In par- 
ticular, he must make the correct response if the event occurs, and he must 
make no response if there is no relevant event, that is, he must not hallucinate 
events. We wish to maintain a relation between the subject and his environ- 
ment in which the subject has a constant degree of vigilance. 

But it is well known that the degree of vigilance decreases and a subject’s 
responsive behaviour is impaired if he has been a long while at the job and, in 
particular, if he is fatigued. Further the subject’s degree of vigilance varies 
according to the class of relevant event (as a rule, he may be competent to 
deal with some sorts of event, but incapable of dealing with others). 

In the system of Fig. 2 there are several classes, $X_i$, $i = 1, 2, \ldots, m$, of
relevant events, each to be dealt with according to a rule, $Q$. In other words,
the desired response, $y$, given event $x$, is $y = Q(x)$ where $y$ is selected from a
set $Y$ and where $x$ is selected from a subset $X_i$ of $X$ (the set of all relevant
events, including the null event). We assume that, unbeknown to the subject,
it is occasionally possible to delegate his job to another data processing
system, or to do without his services. These occasional intervals must be
spaced fairly uniformly in time. They are used for the processing of “test
signals” which, so far as the subject is concerned, are indistinguishable from
relevant events and we assume that his behaviour with reference to them and
to reality is identical.

The constant vigilance control system is most readily described if we assume 
that the control mechanism is able to inject a test signal at any instant al- 
though it aims to minimize the amount of time spent in this fashion. This 
assumption may be relaxed or even renounced in many practical applications 
of the system, for example in inspection and sampling jobs. 

The control mechanism must first of all compute a conditional correct
response probability estimate for each class of event, such as the frequency
estimate

$$z_i(n) = \frac{\text{Number of test trials between } n \text{ and } n-\tau \text{ for which } x \in X_i \text{ and } y = Q(x)}{\text{Number of test trials between } n \text{ and } n-\tau \text{ for which } x \in X_i}$$

so that, as $\tau$ increases, $z_i(n)$ approaches the conditional correct response
probability $p_i(n) = p[y = Q(x)]$, $x \in X_i$, for $t = t_n$.
The vigilance of the subject at the $n$th trial (or at the instant $t_n$) is estimated
by

$$V(n) = V(t_n) = \frac{1}{m} \sum_i z_i(n)$$

on the assumption that the probability, $\Pi_i(n)$, that $x \in X_i$ at the $n$th test
trial approaches $1/m$ (in fact, due to the activity of the control mechanism,
this assumption is fairly plausible).

The first part of the control mechanism determines the probability $R(t)$
that some test trial will be made in the interval between $t$ and $t + dt$ where $dt$
is short. (If a test trial is made in this interval and if it is the $n$th test trial
then $t_n$ is between $t$ and $t + dt$.) In the experimental system the control
mechanism was designed so that

$$R(t) = 1 - V(t_{n-1}). \qquad (1)$$

The next part of the control mechanism selects the class of events from
which one event will be “haphazardly” sampled at $t = t_n$, if some test trial
is made on this occasion. The selection probability for the class $X_i$ is denoted
$\Pi_i(n)$ and in the experimental system the control mechanism was designed
so that

$$\Pi_i(n) = 1 - z_i(n). \qquad (2)$$

Each of the terms, $R(t_n)$ and $\Pi_i(n)$, is interpreted as a bias applied to a
probabilistic selection device.

The effect of equation (1) is to increase the test trial making rate as the
vigilance of the subject decreases and the effect of equation (2) is to rehearse
most frequently the most neglected events. (In practice, the latter process
maintains the average values of the $\Pi_i(n)$ close to $1/m$ as required in the
definition of vigilance.)
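
The two-part control mechanism can be sketched as follows. This is a minimal illustration under stated assumptions, not a description of the laboratory equipment: the class name `VigilanceController`, the sliding window counted in test trials, the convention that an untested class scores zero, and the use of the biases as unnormalized sampling weights are all hypothetical choices.

```python
import random
from collections import deque


class VigilanceController:
    """Tracks z_i(n) over a sliding window of test trials."""

    def __init__(self, m, window):
        self.m = m
        # Each entry is (class_index, correct); old trials drop out automatically.
        self.trials = deque(maxlen=window)

    def record(self, class_index, correct):
        self.trials.append((class_index, bool(correct)))

    def z(self, i):
        """Frequency estimate of correct responses for class i."""
        outcomes = [ok for k, ok in self.trials if k == i]
        if not outcomes:
            return 0.0  # assumption: an untested class counts as fully neglected
        return sum(outcomes) / len(outcomes)

    def vigilance(self):
        """V(n) = (1/m) * sum_i z_i(n)."""
        return sum(self.z(i) for i in range(self.m)) / self.m

    def test_rate(self):
        """Equation (1): bias towards making a test trial."""
        return 1.0 - self.vigilance()

    def class_biases(self):
        """Equation (2): bias towards each class of test event."""
        return [1.0 - self.z(i) for i in range(self.m)]

    def choose_class(self, rng):
        """Sample a class, treating the biases as unnormalized weights."""
        biases = self.class_biases()
        if sum(biases) == 0:
            return rng.randrange(self.m)  # all classes perfect: sample uniformly
        return rng.choices(range(self.m), weights=biases)[0]


ctrl = VigilanceController(m=2, window=10)
for cls, ok in [(0, True), (0, True), (0, True), (1, True), (1, False)]:
    ctrl.record(cls, ok)
chosen = ctrl.choose_class(random.Random(0))
```

In this example class 0 has been answered correctly throughout, so its bias is zero and the next test event is drawn from the neglected class 1.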


1.4. MAINTAINING A CONSTANT AMBIGUITY OF VISUAL PATTERN STIMULI 


When a subject engages in a classification task, he is prone to use familiar 
attributes as the basis for his classification. The experimenter can avoid this 
tendency, by telling the subject in advance that he must use particular 
classifying attributes. In visual classification, for example, the subject may be 
told to classify each member of a sequence of tachistoscopically exposed 
pattern stimuli according to attributes like “size” and “number of distinct 
parts” and “circularity” and so on; these names being assigned, in a mechan- 
ized experiment, to response buttons that are pressed after each tachistoscopic 
exposure. However, this expedient fixes the attribute names and prevents us 
from finding out how the subject chooses new attributes. But equally (to 
reiterate the original statement) if the subject is allowed to name the buttons 
as he desires (he is allowed to select his own attributes), he nearly always 
reverts to familiar (usually geometric) features and, as a result of this, his 
behaviour exhibits problem solving according to familiar methods rather than 
problem posing or the construction of novel attributes, tests or methods. 

Our comments about the freely choosing subject are fairly accurate if the
stimuli are unambiguously presented and if the subject is able to use his
collection of descriptive attributes in a consistent fashion. If the presentation 
is ambiguous, the subject may be forced to relinquish his original choice of 
attributes (because he cannot use them consistently) and to select new ones. 
In fact, this reorientation does occur to some extent when ambiguity is 
introduced (conveniently by reducing the tachistoscopic exposure interval). 
But there is a delicate balance between the tendency to search for new con- 
sistently applicable attributes and a tendency to give up altogether, in the 
belief that it is impossible to make sense of the environment. 

Some years ago (Pask, 1964c), we attempted to induce problem posing 
behaviour by regulating the level of stimulus ambiguity, using visual stimuli 
and the equipment in Plate 2. The stimuli were arranged in blocks or sub- 
sequences of about 50 items and were all of the same type; chequerboard 
patterns. After dealing with a block of stimuli by classifying them, the subject 
is allowed to rename the classifying response buttons (to redefine his at- 
tributes). The experimental data consists in the list of the attribute names that 
are recorded after each block or subsequence of stimuli. 

Within a particular block of trials, the subject assigns names to a maximum 
of eight response buttons; he may use less than eight if he wishes, providing 
that his categorization is informative and also self consistent. Before the 
experiment we describe, rather carefully, what is and is not informative and 
self consistent; and if the subject has any difficulties, these are discussed. 

An assignment of attribute names is informative if the subject can use the 
named attributes to divide the stimuli into coherent subsets. An attribute 
like “being a chequerboard pattern” is not informative because the subject 
knows that all of the patterns belong to the class it intentionally defines. 
Nor is an attribute informative if it can never assume a positive value. 
Ideally, an attribute should dichotomize the universe of patterns into one 
class for which it assumes a positive value and another, disjoint, class for 
which it assumes a negative value. But this ideal can rarely be achieved and 
we did not insist upon a very close approximation. 

So far as self consistency is concerned, the subject is regarded as self 
consistent in his use of an attribute if, when presented with the same stimulus 
upon several occasions, he assigns the same value to this attribute. In fact, 
stimulus repetition (and comparison of the initial and the subsequent values 
of each of the attributes) is the device employed to measure the subject’s 
degree of self consistency. But (on most of the occasions at any rate) the 
subject is unaware that an identical stimulus is repeated. From his point of 
view self consistency is a matter of using his own selection of visual cues in a 
consistent fashion. 


In the system of Plate 2, however, the measure of self consistency is used as a 
feedback variable. As the degree of self consistency increases, the tachisto- 
scopic exposure interval is reduced and (with it) the stimulus ambiguity is 
increased. The adjusting coefficient in this feedback loop is empirically 
determined. 

Using this arrangement, the level of self consistency is a parameter of the
system (its value is maintained constant by an adjustment of stimulus
ambiguity). As the value of this parameter is decreased and the average level
of stimulus pattern ambiguity increased, it becomes more and more difficult
for the subject to stick to a predetermined set of attributes. At the levels used
in a preliminary experiment (Pask, 1964c) it was possible to dichotomize
our subjects into a group able to reselect attributes (the “problem posing”
group) and a group who were unable to do this. Since undiluted “problem
solving” activity was rendered impossible, subjects in the latter group found
the experimental situation untenable.

Fig. 3. Co-ordinate transformation task. (a) Typical behaviour showing initial interference
between component I (Row) and component II (Column). (b) Display and response facilities.


1.5. MAINTAINING CONSTANT PROFICIENCY IN A CO-ORDINATE TRANSFORMATION SKILL

The subject is presented with the display and response board shown in
Fig. 3. Each trial, or each $\Delta t$ sec, he is presented with a figure in the alphabetic
display. He is required to select the row and column buttons that give the
row and column co-ordinates of this figure in the 3 × 4 rectangular display
within $\Delta t$ sec of the stimulus. With $\Delta t$ set at 2.5 sec and the between-trial
interval at 1.5 sec this is a difficult job, and the novice is unable to do it,
unless he is also provided with cueing information.

Plate 1. Adaptive system for teaching skill: (a) subject’s console and (b) control equipment.
(For unidimensional tracking only one channel is employed. Other channels are used for
further dimensions and interference tasks.)

Plate 2. Constant ambiguity systems. (a) Control equipment and projector. (b) Subjects’
display and response board. Upper row: buttons for assigning attributes as being relevant
to classification or not. Middle row: attribute switches. Lower row: cancel buttons.

Plate 3. Trajectory task used to exemplify method for determining relative difficulty of
subskills and point of psychological equivalence.

Plate 6. A general adaptive metasystem. Originally used to control a group learning situation,
this equipment also provided an individual subject with facilities for purchasing tests,
evidence and response format in a concept acquisition task; his purchases being limited by
the current value of an adaptive bank balance.

The skill is characterized by a couple of error factors (in the sense of Harlow,
1959) namely a row and a column error factor, since the subject’s response
may be row correct and column correct independently. Hence, the cueing
information must be delivered with reference to the row response selection and
the column response selection separately. The information concerned is
provided by the row lamps and the column lamps of Fig. 3, which are
illuminated at a variable interval (within the allowed interval of $\Delta t$ sec) after the
appearance of a stimulus figure in the alphabetic display.

Let $i$ = “Column” or “Row”, and let $p_i$ be a proficiency measure of the
form

$$p_i = \text{Average value}\,(\xi_i)$$

where

$$\xi_i = \begin{cases} +1 & \text{if the } i\text{th type response selection is correct and presented before the } i\text{th type cueing information;} \\ 0 & \text{if the } i\text{th type response selection is correct but too late;} \\ -1 & \text{if the } i\text{th type response selection is mistaken or absent} \end{cases}$$

(the value of $+1$ in this rule may, with advantage, be replaced by the term
$\Delta t - \text{latency}$).

Next, we define $\eta_i$ as the proportion of $\Delta t$ that elapses before the $i$th type
of cueing information is delivered to the subject. The maximum value of
$\eta_i$ is $\Delta t$ and if $\eta_i = \eta_{\max}$ then no cueing information of the $i$th type is delivered.

The control mechanism for variable delay cueing satisfies

$$\eta_i = \epsilon \int (p_i - p_0)\, dt$$

where $p_0$ is the required level of proficiency and $\epsilon$ is the positive rate term.
The system is started in the initial condition $\eta_i(0) = 0$ and the value of $\eta_i$ is
well defined for $\eta_{\max} > \eta_i$.
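
The scoring rule and the variable delay controller can be sketched together. This is a minimal illustration with several assumptions that are not in the text: the delay is held as a fraction between 0 and an assumed maximum of 1, the integral is advanced one step per proficiency update, and the names `trial_score` and `CueDelayController` are hypothetical. One controller would be run for the row selection and another for the column selection.

```python
def trial_score(correct, response_time, cue_time):
    """Score one response selection: +1, 0 or -1.

    +1: correct and made before the cue was delivered
     0: correct, but only after the cue
    -1: mistaken or absent (response_time is None)
    """
    if not correct or response_time is None:
        return -1
    return 1 if response_time < cue_time else 0


class CueDelayController:
    """Integral control of the cueing delay for one error factor."""

    def __init__(self, p0, epsilon, eta_max=1.0):
        self.p0 = p0            # required level of proficiency
        self.epsilon = epsilon  # positive rate term
        self.eta_max = eta_max  # assumption: delay is a fraction of the trial
        self.eta = 0.0          # initial condition: cue delivered at once

    def update(self, p, dt=1.0):
        """Advance the delay by epsilon * (p - p0) * dt, kept within bounds."""
        self.eta += self.epsilon * (p - self.p0) * dt
        self.eta = min(self.eta_max, max(0.0, self.eta))
        return self.eta

    def cue_time(self, delta_t):
        """Seconds after the stimulus at which the cue appears, or None."""
        if self.eta >= self.eta_max:
            return None  # no cueing information is delivered
        return self.eta * delta_t


row_ctrl = CueDelayController(p0=0.5, epsilon=0.1)
eta_after = row_ctrl.update(p=1.0)
cue_at = row_ctrl.cue_time(delta_t=2.5)
```

A proficient subject drives the delay upward until the cue disappears altogether, while a struggling subject pulls it back towards immediate cueing.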

The assumption underlying simplification by a variable delay cueing
procedure, with $\Delta t$ constant, is that the subject solves problems at the same
rate throughout the learning process. Given this, the cueing information can
be delivered at a crucial (or optimal or constancy maintaining) instant during
the decision process. However, the constant rate assumption is (at the most)
a crude approximation and the serious use of variable delay cueing rests upon
some sort of “clamping” technique; a technique whereby the trial duration
is “clamped” to the possibly variable interval occupied by the decision
process.

We have recently developed a clamping technique, and we are currently 
applying it to a study of the way in which subjects direct their attention to 
different cues both in the laboratory and in a real life training situation. 

The clamping technique involves a variation of $\Delta t$, which we now write
as $\Delta t(n)$. We wish to find the expected value of the interval that will be
occupied by problem solving if the subject responds correctly at the $n_0$th
trial. Let the estimated interval be $\tau(n_0)$ where

$$\tau(n_0) = \text{Mean value, for } n_0 > n, \text{ of } \Delta t(n) - \text{Latency at the } n\text{th trial, if a correct response is made.}$$

To complete the feedback loop we now set

$$\Delta t(n_0) = \tau(n_0) + \delta$$

where $\delta$ is positive.
Recalling that $\eta_i$ is defined proportionally, the cueing delay becomes a
proportion of $\Delta t(n)$, rather than a proportion of $\Delta t$.


1.6. Set ata PSYCHOLOGICAL EQUIVALENCE BETWEEN PARTS OF A 

The subject is provided with the display and response facilities of Plate 3. 
Each cell in the array contains a couple of lamps; one of these is used to 
provide a visual “background noise” signal and the other to indicate a 
relevant signal sequence or trajectory. The visual “background noise” is
obtained by sequentially switching groups of eight or ten lamps at once 
every } sec, the switching being haphazard; the trajectory is imposed upon a 
variable amplitude of background illumination. There are eight possible 
types of trajectory, each being indicated by a sequence of four illuminated 
cells from left to right across the display. The subject is assisted in discerning 
where a relevant signal is located by a lower lamp which indicates the pair of 
columns within which the trajectory signal is located, if it exists. Each trajec- 
tory passes through the bounded region in the display and the subject is 
required to intercept each trajectory by placing an interceptor in a suitable 
cell in this bounded region (which he does by pressing one of the buttons). 

Subjects find that different trajectories present different degrees of difficulty
and the distribution of difficulties is rather idiosyncratic. If, in this case,
we wish to equate the parts of the combined interception skill that are
concerned with the interception of different types of trajectory, it will be
necessary to modulate the potential difficulty of each of these parts. This can
be done by varying the amplitude of the background illumination separately
for each type in such a way that the subject achieves a constant proficiency in
connection with each sort of interception.
Let $\eta_i$ represent the background illumination amplitude and let $p_i$
be the subject’s proficiency with reference to the $i$th type of trajectory.
A rule such as:

$$\Delta\eta_i = \begin{cases} +1 & \text{if } p_i > p_0, \text{ unless } \eta_i = \eta_{\max}, \text{ when } \Delta\eta_i = 0 \\ 0 & \text{if } p_i = p_0 \\ -1 & \text{if } p_0 > p_i, \text{ unless } \eta_i = 0, \text{ when } \Delta\eta_i = 0 \end{cases}$$

will maintain a steady-state value of $\eta_i$ such that $p_i \to p_0$ (if the subject learns,
this steady-state value changes slowly to compensate for his learning).
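
The step rule translates directly into code. The sketch below is illustrative only: the function names, the integer amplitude levels and the sample proficiency values are hypothetical.

```python
def delta_eta(p, p0, eta, eta_max):
    """Step applied to one background amplitude level.

    +1 when proficiency exceeds the target (unless already at eta_max),
    -1 when it falls short (unless already at 0), and 0 otherwise.
    """
    if p > p0:
        return 0 if eta >= eta_max else 1
    if p < p0:
        return 0 if eta <= 0 else -1
    return 0


def update_levels(etas, proficiencies, p0, eta_max):
    """Apply the rule separately to every trajectory type."""
    return [eta + delta_eta(p, p0, eta, eta_max)
            for eta, p in zip(etas, proficiencies)]


# Hypothetical state for three trajectory types, with target p0 = 0.8.
new_levels = update_levels([3, 5, 0], [0.9, 0.9, 0.7], p0=0.8, eta_max=5)
```

Because each type is adjusted on its own proficiency, the levels drift apart until every type is performed at the same target proficiency.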

If the values of the $\eta_i$ are different, then we may argue that a type 1
trajectory in conjunction with the background amplitude $\eta_1$ is psychologically
equivalent to a type 2 trajectory in conjunction with the background amplitude
$\eta_2$ (the value of $\eta_1$ is greater than the value of $\eta_2$ if a type 2 trajectory is more
difficult to deal with than a type 1 trajectory).

It is evident, on looking at the recordings from a system of this sort, that 
there is a great deal of interaction between the performance of each part of 
the skill and usually the several subskills interfere with one another. It appears 
impossible to achieve a completely accurate performance if there are eight 
types of subskill and if the trajectory is presented within an interval of 5 
sec with a 2-sec rest interval between trials. 


1.7. MAINTAINING CONSTANCY OF PERFORMANCE IN PERCEPTUAL
DISCRIMINATION EXPERIMENTS (LEWIS, PASK & WATTS, 1964) 

In Plate 4, the subject is presented, at each trial, with a pair of figures, one in 
display A and one in display B. These are back illuminated and haphazardly 
oriented dot patterns. The total number of dots in each pattern is 18 (in all 
of the experiments so far performed) and the difference between the number
in A and the number in B is ±5, ±3, or ±1.

To maintain a constancy of performance, we use a control mechanism 
that modulates a couple of variables, namely: 


(i) whether the difference is + or − and
(ii) given the sense of the difference, its value of 5, 3 or 1. 


The subject is required to assert, by pressing response buttons, whether the 
difference is in favour of display A (having more illuminated dots) or in 
favour of display B, and he is required to make an assertion after each presen- 
tation of an A, B, pair. He may or may not, depending upon the conditions 
in the experiment, receive information about whether or not his assertion is 
realistic.

To satisfy (i) the probability of selecting an “A greatest” stimulus, which is
1 − the probability of selecting a “B greatest” stimulus, is made proportional
to the balance of correct “B greatest” to correct “A greatest” assertions. To
satisfy (ii) the actual number of illuminated dots is modified separately for
each sense of the difference, so that the subject’s proficiency is equal in each
case and so that it approaches a predetermined value of about 0.8.


1.8. MAINTAINING A CONSTANCY OF PERCEPTUAL AND COGNITIVE 
LOADING IN A TRANSFORMATION TASK (LEWIS & PASK, 1965; PASK, 
1965; PASK & LEWIS, 1971) 

In a transformation task, the subject learns the skill of applying Q, a 
coding or transformation rule that defines some one to one correspondence 
between the row A and the row B lamps, in Plate 5. The stimuli, x, consist in 
groups of up to four lamps, selected for illumination from row A. The subject 
is supposed to specify the transformation of x by pressing suitable response 
buttons, C (in one to one correspondence with the lamps in row B), in order 
to define a response construction, y. If y = Q (x), he receives an affirmative 
knowledge of results signal, D, and if not, a negative knowledge of results 
signal. The response construction must be made within an interval of $\Delta t$
after the stimulus is displayed (and it is evident that the stimulus only poses a
problem to be solved because of this restriction upon the subject’s behaviour).
For $\Delta t$ between 4 and 5 sec, this is a fairly severe restriction and the novice is
unable to perform the skill. He is overloaded and the problems appear to
him to be unintelligible.

In order to initiate learning it is necessary to simplify the stimuli, and this is
done by reducing the perceptual and cognitive load through a reduction in
the number of illuminated lamps in a stimulus configuration from four to
three, to two, to one, at which level any subject, even the complete novice,
can perform the task. Depending upon the level of difficulty (conversely,
simplification) stimuli are selected from one or another of the rows in
Fig. 4.

Using decision rules of the sort we have already considered, it is easy
enough to identify $\eta$ with the row numbers in Fig. 4 and to determine $\eta$ as a
function of $p$ so that $p \to p_0$. For a single transformation rule or mapping, $Q$,
there is, of course, a strict limit to the interval for which the condition can be
maintained. When the subject has learned to apply $Q$ (this is entirely different
from knowing $Q$, which he does all along) the adaptive machine is no longer
able to control the system by adjusting the stimulus difficulty. This stable
interval is greatly extended in the case of a task that entails the alternation of
transformation rules, say $Q_1$ and $Q_2$, between which there is appreciable
interference.

Fig. 4. Stimuli are generated, as patterns, by assigning values to four signal variables A, B,
C, D (2 valued in the present account, 4 valued in other experiments). $\eta$ alters the number
of variables treated as stimulus co-ordinates. There are 15 pattern subsets $U_A$, $U_B$, ...;
$U_{AB}$, $U_{AC}$, ...; $U_{ABC}$, $U_{ABD}$, ...; $U_{ABCD}$. For 2 valued variables there are 16 elements in
$U_{ABCD}$; but the generating set contains only 8 stimuli, $x$, so chosen that if further stimuli, $x \in X$,
are obtained by deleting one or more co-ordinates all subsets $U$ with the same number of
indices contain the same number of stimuli (8 in $U_{ABCD}$ or any of $U_{ABC}$, $U_{ABD}$, ...; 4 in any
of $U_{AB}$, $U_{BC}$, ...). If X is the union of all subsets, U, then X < X. Similarly, there are
four response variables A*, B*, C*, D* for which there are 15 index subsets V. A complex
response, $y$, is a member of Y = Union of all V. The correct mapping $Q$: X → Y is induced by
a one to one correspondence $Q^*$: (A × B × C × D) → (A* × B* × C* × D*). A response is
correct if and only if $(x, y) \in Q$; $x \in X$, $y \in Y$.
Fig. 5. An adaptive system incorporating a controller ($M^1$, $M^0$). For any fixed assignment
of the index $i$, $M^0$ is identical with B in Fig. 1(I). The parametric arrow notation implies
that when $M^1$ changes the value of $i$, all indexed entities may be altered (namely $p_i$, $p_{0i}$,
$\eta_i$, $Q_i$, $\epsilon_i$) or only some of them.


1.9. RULE ALTERNATION

Using the device in Fig. 5, it is perfectly possible to instruct a skill consisting
in the ability to apply either $Q_1$ or $Q_2$ (as required by an instruction
naming the value of $i$ = 1 or 2) when the pertinence of these rules is alternated.
The control mechanism contains two levels, $L^1$ and $L^0$, and two separable
machines $M^1$ and $M^0$. Of these, $M^0$ is identical with the control mechanism
in Fig. 1(II) apart from the fact that it instructs a pair of subskills (one of them
concerned with the application of $Q_1$, and one with the application of $Q_2$). The
higher level machine, $M^1$, specifies an alternation strategy for the rehearsal of
Subskill 1 and Subskill 2. It does so on the basis of the values of $\eta_1$ and $\eta_2$
which are determined by the activity of $M^0$.

Strictly, $M^1$ should be designed to accept and maximize an estimate of

$$\frac{d(\eta_1 \cdot \eta_2)}{dt}$$

since this represents a learning rate. In practice, it is possible to show that
the chief impediment to learning is the subject’s tendency to rehearse one
subskill at a time instead of learning a concept that allows him to view the skill
as a whole. Because of this, it is sufficient, for this particular learning
situation, if $M^1$ aims, as a hill-climbing mechanism, to maximize the simple
product $\eta_1 \cdot \eta_2$.
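
A hill-climbing mechanism of this general kind can be sketched as below. The text does not say which quantity the higher level machine adjusts, so the sketch assumes a single hypothetical parameter `q`, the proportion of trials devoted to Subskill 1, and a toy response in which each difficulty level grows with the share of rehearsal its subskill receives. The step-halving search is one simple hill-climbing scheme chosen for illustration, not the mechanism used in the laboratory.

```python
def hill_climb(objective, q=0.5, step=0.1, iterations=50, lo=0.0, hi=1.0):
    """Climb `objective` over q, reversing and halving the step on failure."""
    best = objective(q)
    direction = 1
    for _ in range(iterations):
        candidate = min(hi, max(lo, q + direction * step))
        value = objective(candidate)
        if value > best:
            q, best = candidate, value  # improvement: keep moving this way
        else:
            direction = -direction      # no improvement: turn round
            step *= 0.5                 # and search more finely
    return q, best


def toy_product(q):
    """Hypothetical response: eta_1 = 4q and eta_2 = 4(1 - q)."""
    eta_1 = 4.0 * q
    eta_2 = 4.0 * (1.0 - q)
    return eta_1 * eta_2


best_q, best_value = hill_climb(toy_product, q=0.2)
```

With this toy response the product is largest when rehearsal is shared equally, so a strategy that dwells on one subskill scores poorly, which matches the impediment the paragraph describes.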


1.10. PARAMETRIC ADJUSTMENT 


The system described in Section 1.9 is a special case of control by an
hierarchically organized adaptive control mechanism (with levels of control $L^0$,
$L^1$, ...). The parameter adjusted by $M^1$ need not be a term in a rule alternation
strategy. It may, for example, be the rate term in a simple control system
(such as $\epsilon$ in 1.2) or the value of $p_0$. Again, the output of $M^1$ may operate
upon the type and amount of cueing information or the number of alternative
subskills (an arrangement of this sort has been used in connection with the
system described in 1.6 to vary the number of classes of trajectory from which
one trajectory is selected for presentation at the $n$th trial).


2. Learning Models 
2.1. INTRODUCTION 


In this part of the paper, we discuss the role and status of models, in 
particular of models for the learning process. We shall review a number of 
philosophical notions and distinguish between two different types of model 
(the “functional” and the “programmatic” models of 2.3). These types of 
model and their utility are discussed separately (in 2.4 and in 2.5). We argue 
that, so far as man is concerned, the programmatic model has definite 
advantages and in 2.6, 2.7 and 2.8, we shall develop a special class of
programmatic models.

As a preliminary, let us recall a distinction made by Cherry (1957) between 
a descriptive metalanguage and an object language. Models are descriptive
and predictive structures that are themselves described by an observer in a
descriptive metalanguage. For example, to use a case cited by Cherry, we
describe an information channel (its transmitter, receiver, alphabet, code and
transition rules) entirely in terms of a descriptive metalanguage. In much the same
way, the model is related to or identified with the real system that it represents, 
as an analogue, in terms of a descriptive metalanguage. The object language, 
on the other hand, is the language used for communication within the model 
(and, by analogy, the language used for communication between whatever 
real systems correspond to the transmitter and the receiver). 

So far as functional models are concerned, this structure seems obvious 
and its explicit statement a little pedantic. (For here, the object language, in 
so far as we credit a collection of distinguished signal states with such a 
name, is chosen by the observer alone.) So far as programmatic models are 
concerned, the notion of an object language and a metalanguage is absolutely 
vital (1) because the whole model hangs upon the existence of a cogent symbol 
system that, very plainly, serves as an object language and (2) because the 
observer is not always free to choose or to limit the usage of this object 
language. A great deal of confusion can be avoided in this and the later parts 
of the paper if we keep the distinction between descriptive metalanguages and 
object languages very firmly in mind. 


2.2. PHILOSOPHICAL DISCUSSION 


Any measurement whatever is made with respect to a model which is 
analogous to the real system under observation. Even in the simplest sort of 
psychological experiment, the experimenter presents stimuli and observes 
responses. These categories of events (“stimuli” and “responses”) are not
given as immutables by nature. They are invented by the experimenter him- 
self and form part of his model for the observed subject (in this very simple 
case, the “input” states and the “output” states of the model). It is perfectly 
true that the experimenter need not make any predictions about the result of 
stimulating the subject and it may be the case that his model is insufficiently 
structured to allow any coherent prediction. Nevertheless, it exists in the 
experimental situation and may be conceived as lying in parallel with the real 
system or subject as suggested in Fig. 6. It is, as we proposed a moment ago, 
part of the definition of the entities “stimulus” and “response”.

Hence, in order to make consistent measurements, it is necessary to
maintain the analogy relation between the model and the real system constant
valued (or, at any rate, to maintain this constancy so far as the measured 
state variables of the model are concerned). This constancy of relation 
condition will recur in the argument and will be abbreviated to C.R. The 
analogy between states of the real system (or subject) and the states of the 
model is often called an identification. Here, “identification” means that 
certain relevant properties of the system are placed in correspondence with
the state variables and parameters of the model (for example, by specifying
measurement procedures and experimental conditions). Given a model and
its identification and being assured of C.R., the “state of the system” is no
more nor less than the state of the model (the conjoint evaluation of its state
variables).

Fig. 6. Identification process. According to the C.E.P., the observer chooses the
subject’s experimental environment so that it can be conveniently modelled,
chooses the model’s environment as identical and builds the structure of the
model to replicate real behaviour. The alternative paradigm for identification
is shown in Fig. 20.

In fact, there are several ways of maintaining C.R. which is an agreed 
prerequisite for measurement. One of them was mooted in the earlier part of 
this paper, and we shall return to it. But the most familiar method of
maintaining C.R. is the “classical experimental paradigm” or C.E.P. The C.E.P.
has the virtue of simplicity and was developed to a great nicety in the classical
sciences. It is achieved by preserving a constant identification between the
model and the real system, usually through a rigid control of the environmental
variables and an attempt to keep the internal parameters of the system
constant valued.

Because of its elegance and simplicity, the C.E.P. was imported directly 
into psychology where, in some fields, it is entirely adequate. In other parts of 
psychology, however, it is doubtful whether the C.E.P. is usefully applicable. 
In particular, the character of the phenomenon “learning” impedes the ap- 
plication of the C.E.P. when we aim to measure manifestations of the 
phenomenon in a learning experiment. 

It is evident, for example, that C.E.P. must be modified to allow for the 
variation of some internal parameters; for unless some internal parameters 
change in value, no learning can take place. This does not vitiate the applica- 
tion of a modified form of C.E.P.; indeed, we shall argue that a modified 
C.E.P. is appropriate when the model for learning is a functional model (as 
defined below) and when the learning process amounts to goal directed 
adaptation. On the other hand, if the experimenter uses an active program- 
matic model (again, as defined below) or if the learning process being in- 
vestigated by the experimenter has the calibre of concept learning, then the 
C.E.P. is utterly inapplicable. We also venture the comment that C.E.P. is 
rather likely to prove inapplicable in connection with human learning other 
than the human conditioning manifest in physiologically oriented experiments. 
When the C.E.P. is inapplicable it is necessary to maintain C.R. by another 
method. 

To phrase this argument in less intuitive terms it will be necessary to give 
some attention to the issue of models and the analogy relations in which they 
are involved. We shall also review the attempts that are made to achieve 
C.E.P. in psychological learning experiments (in other words, the methods 
used to preserve a constant identification) and will comment upon their 
inadequacy. Finally, we shall exhibit one case in which C.E.P. is obviously 
inapplicable and will attempt to detail a different and dynamic method for 
maintaining C.R. 


2.3. MODEL TYPES 

Any model represents an organization. Although the model is perfectly 
respectable in abstraction it must, for the present purpose, be capable of 
physical embodiment either in a special artifact or as a computer simulation. 
We shall often refer to models dually (as “working models” on the one hand 
and as “abstract organizations” on the other), but this should cause no con- 
fusion. 

It will be necessary to consider two sorts of model, “Functional Models”
and “Programmatic Models”. They differ in respect to the interpretation
placed upon the building blocks from which they are constructed and the
domains in which they operate.

Functional models represent mechanical or biological organizations in real 
systems (such as brains) that perform computations. The functional building 
blocks are tokens for processes and may usually be replaced by many func- 
tionally equivalent types of computing machinery. In the model these pro- 
cesses are causally related. Functional models describe the subject or his 
brain as viewed, from the outside in, by a psychologically oriented engineer. 
Like any other engineering image they are described in whatever terms the 
model builder finds convenient, lucid and manipulable. The choice of nota- 
tion is made at the model builder’s discretion. 

Because of these characteristics, the model cannot be credited with the
“interpretation” of signals or stimuli. Rather, they “act upon it” in a deter-
ministic or probabilistic fashion. When we talk of the probability of an event
in the model, this is a model builder’s parameter and there is no sense in
which we can introduce the concept of a “subjective probability” (or of
probability, information or uncertainty values “as seen by the model”).
The majority of functional models are control systems which aim for a goal. 
But, once again, the characteristics of the model only permit us to interpret a 
goal in a causal fashion, as the state which leads to equilibrium in some higher 
level control system such as a drive mechanism. 

Programmatic or “algorithmic” models are cognitive organizations. They
are described in terms of a symbol system that is cogent for the modelled 
organism and which the model builder is not entirely free to choose. In terms 
of this symbol system the real organism is able to interpret states of itself and 
its environment as states of knowing; typically as problems that demand a 
particular type of solution. It is often possible to change the interpretation 
code of the model (which makes it compute like a particular cognitive process) 
by instructions that tally with the instructions given to a subject. The basic
building blocks of the model are cognitive operations and they are related in a 
programme; hence, the name “programmatic” model. 

Like functional models, programmatic models are usually control systems 
(since problem solving is no more nor less than a mode of symbolic control) 
and it may be convenient to regard the cognitive operations as TOTE 
(Miller et al., 1960) units. However, the goal of a programmatic model is 
specified in terms of consonance between higher level descriptions in the 
cogent symbol system and not in the mechanical fashion we cited a moment 
ago. Similarly, within a programmatic model it is perfectly reasonable to 
talk about the “subjective probability” of an event (providing we ascribe it 
the rather prosaic connotation of “the probability of the event as seen by the
model”).

As a final distinction between functional and programmatic models, we 
comment that a programmatic model is an incomplete representation of the 
subject as an individual. It has no mechanical embodiment though it may, of 
course, be embodied. It is, after all, a computer programme without a 
computer. So, to complete the model, it is necessary to embody it with those 
limitations upon its embodiment that would have been encountered if the 
programme had been run on that specifically constrained biological computer, 
the brain. When the programmatic model is simulated on a digital computer, 
these specific constraints may be written (and we shall assume that they are 
written) into an auxiliary “Resource Allocation” programme. 

The combination of a programmatic model with a resource allocation 
programme or its mechanical equivalent is important enough to demand a 
special name. I shall call this combination an “organization model” (an 
“organization model” is an active programmatic model). 

There is no argument that compels us to use one sort of model or the other. 
After all, functional and programmatic models are different ways of looking 
at the same thing, the subject. Ultimately, these models are (in principle) 
dual and probably equivalent representations. For the moment, a program- 
matic model will be used if the subject is regarded as a language processor 
and a functional model if he is regarded as a bit of engineering. 

In practice, of course, one point of view may be far more useful than the 
other and it is conceivable that one may be forced upon us by the way that the
model is identified (as we argue later, there is a great deal of difference between 
the identification of a programmatic model and a functional model). On 
pragmatic grounds, I believe we can make a very good case for viewing man 
as a language processor when he engages in a learning experiment; he learns 
to solve problems; he accepts symbolic goals and instructions. Further, the 
identification of a programmatic learning model is possible whereas the 
identification of a functional model is not, unless we confine our attention to 
extremely narrow forms of learning.† Thus, although we shall examine
the merits and the limitations of functional models, there is, in the back- 
ground, a perhaps arbitrary choice in favour of a programmatic model for 
man. 


† I have argued, in other publications (1971), that all learning in the specific sensory
motor systems of an organism is properly reducible to control units and that these are 
properly conceived as operating upon a linguistic domain. This argument is at the back of 
my mind. But it contains several contentious points and the argument will not be used 
explicitly in this paper. 

2.4. FUNCTIONAL MODELS


A malleable or plastic network (Widrow, 1962a, b; Taylor, 1962, 1963; 
Harmon, 1959; Beurle, 1954, 1959; Rosenblatt, 1961; Willis, 1959; Pask, 
1963c) is a special but important case of a functional model. So are the 
conditional probability learning machines constructed by Uttley (1956), 
Steinbuch (1961), Marron (1962) and Pask (1961). This type of model is 
often suggested by anatomical or physiological data and it is a well-established
vehicle for neurophysiological hypotheses. But the physiological interpreta- 
tion is in no way necessary. Broadbent’s (1957) filter model and Crossman’s 
(1960) model for perceptual motor performance and Deutsch’s (1960) model 
for learning exemplify functional models that carry little or no neurophy- 
siological commitment. Similarly, the elements in a malleable or plastic 
network need not be neuronally distinct entities. 

What we do have in a functional model is a set of structural constraints
(such as the network structure), a set of functional specifications for
components (such as the elements in the network) and a definition and limita-
tion of internal parameters (such as the “weights” in the network). These weights
are changed in a positive or facilitating sense if, and only if, there is a con-
solidating signal from the internal drive mechanism and they decay in the
absence of such a signal. The goal directed adaptation of the network model
must leave the network structure invariant and the elements unchanged; the
adaptation is wholly a matter of “changing weight values” in a fashion that
depends crucially upon the activation of a distinguished binary input to the
drive mechanism called the reinforcing variable.
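
The weight-changing rule described above can be illustrated with a small sketch. It is not taken from any of the cited networks: the function name `adapt_weights`, the `gain` and `decay` constants, the bounding of weights between 0 and 1, and the choice to facilitate only the connections that were active on the reinforced trial are all assumptions made for the example.

```python
def adapt_weights(weights, active, reinforced, gain=0.2, decay=0.05):
    """Return new weight values; the number of weights (the structure) is fixed.

    weights    -- current weight values, each between 0 and 1
    active     -- booleans marking connections that took part in the response
    reinforced -- the binary reinforcing variable for this trial
    """
    updated = []
    for weight, was_active in zip(weights, active):
        if reinforced:
            if was_active:
                # Consolidating signal present: facilitate towards 1.
                weight = weight + gain * (1.0 - weight)
        else:
            # No consolidating signal: every weight decays towards 0.
            weight = weight * (1.0 - decay)
        updated.append(weight)
    return updated


weights = [0.5, 0.5]
weights = adapt_weights(weights, active=[True, False], reinforced=True)
print(weights)
weights = adapt_weights(weights, active=[True, False], reinforced=False)
print(weights)
```

Only the values change from trial to trial; the list of connections is never extended or pruned, which corresponds to the requirement that adaptation leave the network structure invariant.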

As it stands, the model is capable of representing adaptation (and if 
realized as a physical artifact it is purely an adaptive machine). It becomes 
capable of goal directed adaptation (and thus, according to most canons, of 
learning) in so far as there is an internal goal embodied in the drive mechanism 
or, more usually, in so far as a control loop is completed in the experimental
environment by a device including a payoff function that determines what is
and is not a reinforced “stimulus and response state”.

Models like this have implicit and important limitations upon their 
capacity or upon the rate at which they can deal with variations in their 
environment. The maximum limit is determined by the possible excitations 
of the elements in the model and by delays in computation and the transfer 
of signals. Further, variants of this model that have a built-in trial making
facility also operate at a minimum rate, for they are constrained to perturb
their environment. Systems of this sort have been considered by Andrew
(1969). We comment that these limits are neither arbitrary nor irrelevant and 
will return to discuss the point later. 
If all the internal detail is omitted from a functional model, it becomes a
Black Box model, in the sense of Ashby (1965). The paradigm is Fig. 7
wherein X is a generic input variable indexing the possible stimulations of the
model and Y is a generic output variable indexing the possible response
selections of the model. The states of the model are pairs, x, y. If we retain
sufficient detail to distinguish some component of the input as a reinforce-
ment signal, then this paradigm is converted into the adaptive filter of
Fig. 8. If we adjoin to this structure an internal drive mechanism modulated
by the reinforcement signal J, we obtain the canonical form for a learning
control model (Fig. 9), which is isomorphic with Ashby’s ultrastable system
(1965); the reinforcing control loop that secures goal directed adaptation
is completed (Fig. 10) through a payoff function in the experimental
environment.† These forms of Black Box are minimal models for learning
and when measurements are said to be made “without a model” this means
that the minimal model, or something approaching it, is adopted. Thus the
behaviouristic paradigm for instrumental conditioning involves an ultra-
stable system and the idea of “operant behaviour” entails, in addition, some
trial making process of the sort considered by Andrew (1959).

Fig. 7. “Black Box” model.

Fig. 8. Filter model. Parametric arrow, though unchanged, represents specification of
structure. If regarded as finite automaton, the structure consists in (irredundant) internal
states s in S. But these can be expressed, externally, via behavioural histories
((x₁, y₁) ... (xₙ, yₙ)).

Fig. 9. Adaptive model controlled by reinforcement J. The internal comparator notation
implies that the upper Box receives a description of the lower Box and is a concise statement
for class of connections shown in Fig. 5.

† The ultrastable system is a goal directed adaptive system in so far as it is stable if,
and only if, it satisfies the goal built into its specification. For the present purpose we need
not insist upon the special goal of survival that is introduced in Ashby’s original argument.
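
A toy version of an ultrastable system may help to fix the idea of a device that is stable if, and only if, it satisfies its goal. The sketch below is an illustration under assumptions, not Ashby's design or the model of Fig. 9: the single multiplicative parameter, the candidate values, the tolerance and the name `run_ultrastable` are all hypothetical.

```python
import random


def run_ultrastable(target=3.0, tolerance=0.25,
                    candidates=(0.0, 1.0, 2.0, 3.0, 4.0),
                    max_trials=1000, seed=0):
    """Re-select a step parameter at random until the essential variable
    lies within bounds. Returns (parameter, trials_used, stable)."""
    rng = random.Random(seed)
    stimulus = 1.0
    parameter = candidates[0]
    for trial in range(max_trials):
        response = parameter * stimulus
        essential = abs(response - target)   # distance from the goal state
        if essential <= tolerance:
            return parameter, trial, True    # equilibrium: no further change
        parameter = rng.choice(candidates)   # out of bounds: random step change
    return parameter, max_trials, False


parameter, trials, stable = run_ultrastable()
print(parameter, trials, stable)
```

The parameter stops changing only when the goal condition holds, so the system's equilibrium and the satisfaction of its goal coincide.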





Fig. 10. Goal directed adaptation. Goal is given by payoff function Φ. As a special case, Φ
may be Ω.














Fig. 11. Learning model. The concept of learning entails the idea of two independent
systems that aim for the same goal (and which may be construed as student/teacher or
subject/control mechanism). The construction is still applicable if the parallel connection
(shown) is replaced by a co-operative or series connection as in Fig. 5.


So far, the modelled learning process has been essentially passive. This is 
true, even for those models that include a trial making component that 
perturbs the environment (here, the model must respond but it is not strictly 
constrained so that it must learn). Perhaps the least elaborate form of
active and functional learning model is the structure in Fig. 11. The stability 
criterion for the model has been adjusted so that the ultrastable system is 
dynamically stable if, and only if, it can engage in a given rate of goal
directed adaptation. Such a device is not difficult to build (as a machine) (Pask, 
1969), but it has the interesting property that its dynamic stability depends 
upon an environment in which it is able to exhibit something akin to 
“curiosity” or alternatively upon the incorporation of an internal “attention” 
mechanism. 


2.5. PROGRAMMATIC MODELS 


(I) Form. A programmatic model (Fig. 12) is based upon a symbol system
that is compatible with the language used for communication between
members of the class of organisms represented by the model (this is the
object language of 2.1, where we stressed the distinction between the
experimenter’s descriptive metalanguage and the object language). This
symbol system is used for the interpretation of stimuli as denoting problems
(or partially solved or simplified problems) according to a specified interpreta-
tive code; it is used for describing the symbolic environment (of problems,
solutions and goal conditions) and the internal state of the model; finally it is
used for prescribing actions which may either be carried out in the environ-
ment as response selections (operations in the L⁰ Box of Fig. 12) or carried
out internally to effect the structural modifications which become evident as
learning (operations in the L¹ Box of Fig. 12). Like any other logically
adequate learning model (for example, the adaptive ultrastable system) the
programmatic model is hierarchically organized and the levels, L¹ and L⁰ in
Fig. 12, are interpreted as levels of control (in the sense of Mesarovic (1962,
1963), Tarjan (1963) and others). In Fig. 12, the control is “symbolic control”
which is a synonym for the easier phrase “problem solving”. Hence, the
concept of “learning” is reducible, within this model, to an hierarchical
organization of “problem solving” wherein the L⁰ problem solvers operate
upon a domain of problems posed by the environment whereas the L¹
problem solvers operate upon a domain of L⁰ problem solvers, usually to
repair deficiencies in the existing repertoire of L⁰ problem solvers.

Fig. 12. A programmatic model.

All of the artificial intelligence models due to Amarel (1963), Banerjee 
(1963), Feigenbaum (1961), Hovland (1961), Hunt (1963), Kochen (1962),
Minsky (1961), Newell (1962), Newell et al. (1961), Reichmann (1966) and 
others have this calibre and the symbol system is conceived in harmony with 
human discourse and human understanding. 

As we commented in 2.3, the programmatic model prescribes an organiza-
tion and is an inherently passive symbol manipulator that transforms and solves
the problems presented to it. If the model is to represent an individual subject,
it must be associated with and “run” in some sort of computing machine
whereby it becomes an active model (the distinction between the programme 
and the “running” programme is essentially the same as Chomsky’s (1965) 
distinction between the organization of a linguistic structure and a speaker of 
the language). 

In fact, if the programmatic model is embodied in a digital computer and 
used to simulate a class of learning processes, no particular mechanical 
limits are imposed (for a computer is designed to minimize the interaction 
between the programme and the machine). But the position is entirely differ- 
ent when the programmatic model is embodied in the biological computing 
mechanism of a brain. If we use the model to represent an individual subject 
learning to solve problems, the special restrictions imposed by embodiment in 
a brain must also be modelled and simulated on a digital computer. As in 2.3,
these restrictions will appear in an auxiliary “resource allocation” programme
that is separable from the “programmatic model” programme that represents 
mentation. The combination of the programmatic and the resource allocation 
programme yields an organization model in the sense of 2.3. 


(II) L⁰ Structure. The L⁰ structure of a programmatic model contains an
interpretative code for states of the model, or, in the sense of 2.3, “states of
knowing”. Specifically, this code specifies an L⁰ goal, it prescribes L⁰ opera-
tions (that will satisfy this goal) upon states of knowing and it describes
either states of knowing or legal sequences of operations. But prescription and
description alone are insufficient. In order to do anything (by way of respond-
ing or dealing with stimuli presented to it) the L⁰ structure of the model must
be filled out by a set of strings of embodied operations (or physical operators)
which have been prescribed and which may be described. We shall refer to
this set of embodied L⁰ operators as I(n), standing for “L⁰ internal model at
the nth trial”, and we shall conceive one sort of learning as the construction
and modification of I(n).

The programme flow charted in Fig. 13 is the L⁰ structure of a model for
a subject performing the transformation task of Section 1.8. (Plate 5 and 
Fig. 4). 

If the stimulus denotes a state of knowing to which a string of operators in
I(n) can be applied (to yield a further state of knowing demarcated as a
solution) then it is an intelligible problem (or just a problem). If the stimulus
denotes a state of knowing to which no string of operators in I(n) can be
applied, then the stimulus may either be “uninterpretable”, at the nth trial,
as a problem, or it may feature as an “unsolvable problem”. In the latter
case, the model is unable to solve the problem at the nth trial but is able to
extend I(n) by learning, in such a way that the problem could have been solved
if the learning had already occurred.
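
The three-way distinction just drawn can be made concrete with a short sketch. It rests on assumptions that are not in the original model: I(n) is represented as a dictionary from a state of knowing to the embodied operators that leave it, the set `interpretable` stands for the states that the interpretative code can denote, and the names `leads_to_solution` and `classify` are hypothetical.

```python
from collections import deque


def leads_to_solution(state, internal_model, solutions):
    """True if some string of embodied operators carries state to a solution."""
    seen, queue = {state}, deque([state])
    while queue:
        current = queue.popleft()
        if current in solutions:
            return True
        for _name, next_state in internal_model.get(current, []):
            if next_state not in seen:
                seen.add(next_state)
                queue.append(next_state)
    return False


def classify(state, internal_model, interpretable, solutions):
    """Label a stimulus state relative to I(n) at the current trial."""
    if state not in interpretable:
        return "uninterpretable"
    if leads_to_solution(state, internal_model, solutions):
        return "problem"
    return "unsolvable problem"


# I(n): state -> list of (operator name, resulting state); illustrative only.
internal_model = {"a": [("op_a", "solved")], "b": [("op_b", "a")]}
interpretable = {"a", "b", "c", "solved"}
for state in ("a", "b", "c", "d"):
    print(state, classify(state, internal_model, interpretable, {"solved"}))
```

Here a state that can be denoted but lies outside the domain of I(n) is labelled an unsolvable problem, while a state that cannot be denoted at all is uninterpretable; whether learning could in fact extend I(n) to cover it is left to the L¹ structure.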

To illustrate these points, Fig. 14 shows the hypothetical L⁰ organization
in a fully proficient subject. The nodes in this graph are states of knowing
and the branches in this graph represent L⁰ operators. If n = T is the trial at
which the subject achieves a terminal criterion of performance, then Fig. 14
represents I(T) (the organization I(T) is unique for a given experiment; but,
since learning may often occur in various ways all of which satisfy the terminal
criterion, the experimenter can only predict a class of I(T)). At trials less
than T, the structure I(n) will be an incomplete fragment of I(T) (or of the
class of I(T)) such as Fig. 15.

It is possible to view I(n) as the L⁰ structure which, at the nth trial, partially
satisfies the L⁰ goal of solving problems of a given class. States of knowing in
the domain of I(n) are problems; those located along the strings of operators
are partially solved problems. The production of a partially solved problem,
given a problem, constitutes the achievement of a subgoal of the goal.
Further, all problems and partially solved problems (all of the nodes in I(n))
can be assigned a “distance” from solution. Thus (Fig. 16) b is at a greater
distance from solution than a. The state of knowing c is not in the domain of
I(n) but it is, given a suitable L¹ structure, an unsolvable problem. The state of
knowing d might be uninterpretable. If we confine our attention to Fig. 16,
this distance is unique. Given the L⁰ structure in Fig. 15, this “distance” is
not unique unless we introduce a rule (which we shall do) constraining the model
to apply the shortest applicable string of L⁰ operators to an intelligible problem.
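
The role of the shortest-string rule in making “distance” unique can be shown with a small sketch. The graph representation, the state and operator names and the function `distances_from_solution` are assumptions made for the example; they are not part of the simulated model.

```python
from collections import deque


def distances_from_solution(internal_model, solutions):
    """Map each state in I(n) to the length of its shortest string of
    operators leading to a solution (breadth-first search, backwards)."""
    predecessors = {}
    for state, operators in internal_model.items():
        for _name, next_state in operators:
            predecessors.setdefault(next_state, []).append(state)
    distance = {solution: 0 for solution in solutions}
    queue = deque(solutions)
    while queue:
        current = queue.popleft()
        for previous in predecessors.get(current, []):
            if previous not in distance:
                distance[previous] = distance[current] + 1
                queue.append(previous)
    return distance


# Problem b can reach the solution by two strings, of length 2 and length 3.
internal_model = {
    "a": [("op_a", "solved")],
    "b": [("op_b", "a"), ("op_b2", "e")],
    "e": [("op_e", "f")],
    "f": [("op_f", "solved")],
}
print(distances_from_solution(internal_model, {"solved"}))
```

Without the rule, b would have two candidate distances; taking the shortest string assigns it one. States outside the domain of I(n), such as c in the text, receive no distance at all.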

We are now in a position to regard learning as reducible to hierarchically
organized problem solving. Learning consists in the construction of I(n) by
the application of L¹ operations which satisfy an L¹ goal of producing I(n)
such that the L⁰ goal may be satisfied.

Fig. 13. Flow charts for a single subskill version of the model (due to Feldmann, Lewis,
Mallen and Pask) current in 1966-1968.

(a) Stimuli x numbered as SN: 1-8, x ∈ X₁; 9-32, x ∈ X₂; 33-64, x ∈ X₃; 64 and above, x ∈ X₄.
The programme decomposes the stimuli into their constituents by rewriting rules based on
the description in Fig. 4.

(b) Response components of y designated by KP.

(c) Operators have domain and range; simple operators, domain SN 1-8; complex operators,
obtained via substitution, a multiple domain.

(d) Range is member(s) of KP (since Ω is one to one, any correct operation has domain of
n elements).

(e) Concatenation produces strings via process of string construction.

(f) Domain of string: SN > 8.

(g) Range of string: set of operator names.

(h) Operators and strings have lifespan (trials to entry of a deletion stack) and a utilization
(their correct use record).

(i) Various stacks are used in programme but two are distinguished as operator store and
string store.

Notes on the main variants

(a) Predictive control of effort using estimate of AA*.

(b) For double subskill (or interference task) model there are two stimuli x = x_A, x_B
of which the control mechanism gives priority to x_A. Other subskill (interfering task) is
assigned value, like a procedure, and x_B is selected for attention in the same way that
procedure is selected.


Chart (I). Overall organization (simulated control mechanism and model for subject).

(III) L¹ Structure. To be definite, we shall consider the L¹ structure of a
model (Fig. 13) simulated in my own laboratory. But I believe that most, if
not all, of the subsequent argument is valid for any active programmatic model
(any organization model) for learning.

The L¹ structure of the model of Fig. 13 consists in a goal specification,
an interpretative code,† a list of L⁰ operations that may be embodied as
L⁰ operators and a set of “L¹ operations”. The L⁰ operation list may be
incomplete at the outset. It prescribes the operations that carry nodal
elements (states of knowing) into other nodal elements (states of knowing,
including solutions). Further additions to this list are made by describing
the strings of operations already embodied in the form of L⁰ operators
(e.g. Fig. 14 or Fig. 15).

Fig. 14. From Fig. 4 (Section 1.8) each stimulus, x, gives rise to a state of knowing because
it is a member of some set U. Sets U are shown here as nodes. A problem is posed by x,
under rule Ω, and is solved by applying a correct operation, a string of operations or a
complex operation (in turn, yielding other states of knowing). Correct operations are
shown here as directed lines, which lead to a unique and central node (the solution); their
application produces components of a response, y. The description of the correct operation
paths is I(T). To simplify the diagram, I(T) is shown separately for x in each of four
sets U (representing η = 4, η = 3, η = 2 and η = 1).

† The interpretative code is partially redundant, because the L⁰ operations are con-
ceived as TOTE units able to recognize the objects upon which they operate (for example, to
recognize stimuli as posing problems, or to recognize partially solved problems). These
operations are already specified in the operation list. However, the model must also interpret
stimuli that pose unsolvable problems as signs for L¹ operations of one sort or another.

Fig. 15. Typical structures of I(n) shown on left for one successful mode (bias to concatenate
operators) and on right for another mode (bias to substitute operators). In each case n₂ > n₁
and the sequence is associated, realistically, with increase in η as learning proceeds; with
construction of some complex operators and deletion of others. The model does not (run
normally) have the effort to use and apply the processes that muster four or even three
operators. Further, unless redundant complex operators are deleted (shown in bias to
substitution case) the model does not have effort to embody essential operators.


The first L' operation is APPLY the shortest applicable string of L° operators
in I(n) to the problem denoted by a stimulus. APPLY is only realized if
such a string of operators exists in I(n) (for example, for problem a (Fig. 16),
when it leads to the application of a single L° operator, or for problem b, when
it leads to the application of a string of operators; it is not real for d or for c).
In so far as APPLY is real, it is given precedence and it leads to a solution of a
problem and to the partial satisfaction of the goal. If the model is presented
with a problem such as d or c, the model is unable to APPLY and such a problem
is interpreted as a sign invoking the L' operations we describe below.
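
As an illustration only, APPLY can be sketched as a shortest-path search over the embodied operators. The dictionary layout, the function name `apply_shortest` and the state label "s" for the solution are assumptions made for this sketch; they are not taken from the simulations described in the text.

```python
from collections import deque

def apply_shortest(operators, problem, solution):
    """Return the shortest string of operator names that carries
    `problem` to `solution`, or None when APPLY is not realizable.

    operators -- dict: operator name -> (source state, target state);
                 a hypothetical encoding of the embodied strings in I(n)
    """
    queue = deque([(problem, [])])
    seen = {problem}
    while queue:
        state, path = queue.popleft()
        if state == solution:
            return path
        for name, (src, dst) in operators.items():
            if src == state and dst not in seen:
                seen.add(dst)
                queue.append((dst, path + [name]))
    return None  # the problem must be treated as a sign for other L' operations

# Assumed reading of Fig. 16: alpha carries a to the solution "s",
# beta carries b to a; c and d are not yet interpretable.
embodied = {"alpha": ("a", "s"), "beta": ("b", "a")}
```

Because the search is breadth first, a shorter string embodied later (such as a single operator from b to the solution) is preferred automatically over the longer string it parallels.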

The first of these is CONCATENATE. To CONCATENATE, the model searches 
the operation list for an L° operation which may be applied to the given 
problem and also adjoined to a string of operators in I(n) which terminates
in a solution state. The gap across which the concatenation process is allowed 
to occur is restricted (essentially, by the length of list search and the form of 
resource allocation programme in the model). For the moment, we shall 
assume that the gap between b and c allows for concatenation although that 
between b and d is too great; we interpret this as a statement that the model 
would have a vanishingly small chance of randomly CONSTRUCTING an opera- 
tion which leads from d to b (and thus to a and the solution) within the 
interval allowed for solving a problem. 
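
A minimal sketch of CONCATENATE followed by EMBODY, under stated assumptions: operators are (source, target) pairs, the restricted "gap" is represented crudely by a cap on the length of the list search (`max_search`), and all names are hypothetical.

```python
def concatenate(embodied, operation_list, problem, solution, max_search=10):
    """Search the operation list for an operation that applies to `problem`
    and adjoins a string in `embodied` terminating in `solution`.
    Returns the operation name, or None if the gap cannot be bridged."""
    # States from which an embodied string already reaches the solution.
    solvable = {solution}
    changed = True
    while changed:
        changed = False
        for src, dst in embodied.values():
            if dst in solvable and src not in solvable:
                solvable.add(src)
                changed = True
    # Restricted list search: only the first `max_search` entries are tried.
    for name, (src, dst) in list(operation_list.items())[:max_search]:
        if src == problem and dst in solvable:
            return name
    return None

def embody(embodied, operation_list, name):
    """EMBODY the selected operation as an operator in the embodied set."""
    embodied[name] = operation_list[name]
```

In this sketch an operation from c to b would be found and embodied, while d stays uninterpretable unless the list happens to contain an operation leading from d to a solvable state.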

CONCATENATE is always followed (in our learning model) by the L' operation
EMBODY the selected L° operation as an operator in I(n). Further,


Fig. 16. States of knowing (arbitrarily chosen ones) shown as nodes, a, b, c, d; operators
are shown as directed lines, α, β. The dotted line indicates that concatenation onto end of
string (α, β) is possible, thus interpreting c. But the chance of randomly constructing an
operator to interpret d, though finite, is small. Double arrows on operator γ indicate
substitution process.


EMBODY is always followed by APPLY, and the persistence of the string of
operators formed by CONCATENATE is indirectly dependent upon the receipt
of a “knowledge of results” signal referring to the application of the
concatenated string of operators (upon the result of an external test). Thus
concatenation is controlled by “knowledge of results” and because of this
I(n) chiefly consists in strings of operators that lead to correct solutions.

The next L' operation is SUBSTITUTE. The model performs this operation, if
the resource allocation programme allows it to do so, whenever it has
completed application or has found that application and concatenation are
impossible. Substitution consists in the placement (or embodiment in I(n))
of an operator performing the same (or, in principle, a similar) job to an
existing operator. In particular (1) substitution may be used to reproduce the
members of I(n) and (2) substitution may lead to the embodiment of more
efficient operators capable of replacing strings of operators. Of these functions
(1) is hard to depict graphically since a branch of the graph is placed in
parallel with an identical branch. (2) is exemplified in Fig. 16 by the placement
of γ in parallel with the string of operators α, β (since APPLY selects the shortest
applicable string of operators, γ will be used upon subsequent occasions and
α, β will not be used; with many resource allocation programmes, this leads
to the decay of α, β).

Next, there are L' operations, akin to substitution, that carry out transfer
of training; they take parts of I(n) and place them in similar positions. These
transfer operations are important when the skill being learned leads to an
I(n) that must be represented by several graphs, for example, if the model is
used to represent the acquisition of a skill having several subskills, such as the
skill discussed in Section 1.9.

Finally, implicit in the SUBSTITUTE operation, there is an L' operation
DESCRIBE. Thus, in order to substitute, the model must know what operator
or string of operators in I(n) is similar to an operation on its operation list.
To do so, it must describe the contents or some of the contents of I(n). The
elements that are described are coded like members of the operation list, and
their similarity with other members is determined. Hence DESCRIBE is an
operation for adjoining members to the initial operation list (for example, the
coded description of the string of operators α, β, which turns out to be similar
to γ). The model’s descriptive format is a coding of Fig. 14.

We comment that the derivation of γ in Fig. 16 may either be conceived
as the matching of an existing code “γ” with the description “α, β” or as the
production of γ from α, β by a rule applicable to descriptions of operator
strings ostensibly defined by embodiment in I(n) (recall that most of the
strings of operators in I(n) lead to correct solutions).


(IV) Uncertainty. We have argued that a distance from solution may be
defined over the states of I(n). Further, given an interpretative code and
problems to interpret, the distance is a constituent of a model’s view
uncertainty measure, M, such as

If the state of the model is in I(n), then there is an interpretable problem
and M = the length of the shortest applicable string of operators,
i.e. the distance to solution.

If the state of the model is a solution in I(n), then M = 0.

If the state of the model cannot be interpreted as a problem but if
CONCATENATE is possible, then it is an unsolvable problem and M = M_1 >
M for any state in I(n).

If CONCATENATE is impossible then M = M_2 > M_1. M_2 = Probability of
Random CONSTRUCTION.
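
The four cases can be collected into one function. This is a sketch under assumptions: the two constants are called `m1` and `m2` here purely for illustration, with m2 > m1 and m1 greater than any distance in I(n), and a state outside I(n) is signalled by `distance=None`.

```python
from typing import Optional

def uncertainty(distance: Optional[int], can_concatenate: bool,
                m1: float, m2: float) -> float:
    """Return the uncertainty M for one state of the model.

    distance        -- length of the shortest applicable operator string
                       (0 at a solution), or None if the state is not in I(n)
    can_concatenate -- whether CONCATENATE is possible from this state
    m1, m2          -- illustrative constants with m2 > m1 > any distance
    """
    if m2 <= m1:
        raise ValueError("m2 must exceed m1")
    if distance is not None:
        if distance >= m1:
            raise ValueError("m1 must exceed every distance in I(n)")
        return float(distance)  # interpretable problem, or 0 at a solution
    # Not interpretable as a problem: fall back on the two constants.
    return m1 if can_concatenate else m2
```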


The average uncertainty, M*(n), at the termination of the nth trial, is
defined as the average over a given number of preceding trials of the
instantaneous values of M weighted by the intervals over which they
were maintained. The actual values of M*(n) depend, of course, upon the
constant M_1 and the constant M_2. But the immediately important points
are


(1) That learning is a process that tends to reduce the subjective un- 
certainty, M* (interpreted as an uncertainty, given a class of problems, 
regarding their solution) by applying and if necessary building strings 
of L° operators. 

(2) That if we choose a resource allocation programme which gives rise to
an active programmatic model (an organization model that must
learn), then there is an implicit restriction of the form M*(n) > M_min > 0;
failing this there is “nothing to learn about”, and the absurdity of
“having nothing to learn about” can only be resolved by allowing the
model to learn about something the experimenter regards as irrelevant;
in which case M* is undefined.

(3) That there is a further limit of the form M_max > M*(n) (which
represents the fact that the model can be presented with problems that
are too difficult to be intelligible). The value of M_max depends upon the
L' organization and, in particular, the concatenation process.

(4) That in order to satisfy the condition M_max > M*(n) > M_min > 0 (which
is, in a sense, an indication of the healthy state of the model) it will
usually be necessary to control the environment, for example, by
selecting the level of difficulty of the problems that are posed as a
function of some estimate of M*(n).
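
The time-weighted average and the control of difficulty in point (4) can be sketched as follows. The step rule in `adjust_difficulty` is an assumption chosen for simplicity; the text only requires that difficulty be selected as some function of an estimate of M*(n).

```python
def average_uncertainty(samples):
    """Time-weighted average of M over preceding trials.

    samples -- list of (M value, interval over which it was maintained)
    """
    total = sum(interval for _, interval in samples)
    if total <= 0:
        raise ValueError("total interval must be positive")
    return sum(m * interval for m, interval in samples) / total

def adjust_difficulty(level, m_star, m_min, m_max, step=1):
    """Hypothetical control rule: keep m_min < m_star < m_max by
    raising difficulty when there is too little to learn about and
    lowering it when problems are becoming unintelligible."""
    if m_star >= m_max:
        return max(level - step, 0)
    if m_star <= m_min:
        return level + step
    return level
```

For instance, a trial history in which M = 2 was held for one time unit and M = 0 for three gives an average of 0.5; with m_min = 1 the rule would then raise the difficulty by one step.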


2.6. LIMITATIONS UPON EMBODIMENT. MEMORY, ATTENTION AND 
AUTONOMY 
Let us return to the restrictions imposed upon the construction and 
application and retention processes. What are the physiological or psychologi- 
cal limitations that we need to image in a resource allocation programme? 
It is difficult to give a concise account of these limits because the same 
constraints are manifest differently in different situations. The best that can
be done briefly is to set out the type of limitation we have in mind (with the
overall caveat that the same constraint is often operating upon apparently 
distinct parts of the mental machinery). 

(1) Viewed as a data processing control system, man acts as a finite 
capacity, goal directed, sampling mechanism, with variable para- 
meters. Some of the constraints responsible for the discontinuities 
in the sampling mechanism are perceptual and some of them reside in 
the hierarchically organized motor system which is so constructed 
that once the organism is committed to an action (leading to a subgoal) 
it remains committed to this action for a definite interval. In other 
words, the organization of the system is partially autonomous. 

(2) Viewed as a perceptual system man can only attend to one goal at once. 
The evidence required for or relevant to goal achievement is processed 
by active perceptual filters (sensory analysers in the sense of Sutherland, 
1964) which extract specific properties of the sensory input (as being 
relevant). Man is able to change his attention and consequently to 
reorganize his perceptual filters, some of which have adaptively 
variable parameters and which, like the motor apparatus, often appear 
to work as partially autonomous units. 

There is plenty of evidence that the filters themselves, once they are 
selected to define a field of attention or a definite interpretation of the 
environment, are hierarchically organized and for some purposes, it is 
useful to distinguish between a sequential and a parallel structure. This 
distinction has less value if (i) we conceive of all attention as goal 
directed and (ii) we comment that if an organism can coherently 
change its attention, the mechanism subserving a given field of attention 
or interpretation must be embedded in an essentially parallel system 
which orients the creature, surveys the goals to which it might be 
directed and surveys the sensory evidence relevant to these goals. 
Given (i), item (ii) does not contradict the dictum that an organism 
attends to one goal at once. The organism does contemplate or con- 
sider something wider than its field of attention but it does not im- 
mediately attend to it. (This is a little more than a quibble over the 
word “attend”. The organism may recall some of the data which it 
contemplates but does not immediately “attend” to.) For the present 
purpose, at any rate, these definitions are satisfactory. Although we
must comprehend the possibility that our subject will change his
attention, the goal and the relevant interpretation of the input data are
laid down by an L' code which is part of the experimental instructions.



(3) Viewed as a coding and abstracting register, man is equipped with a 
hierarchy of memory systems. First, there is a limited buffer store 
(immediate memory) with a definite capacity in terms of coded and
abstracted “chunks” (Miller, 1956) of data (the span of attention or the
span of apprehension). The buffer store is coupled to and interacts with 
an apparently indelible long term memory by a dynamic reorganizing 
system that is often called “intermediate memory” (probably akin to 
Feigenbaum & Simon’s, 1962, program EPAM). Some facets of 
intermediate memory are reflected in our learning model. The joint 
limits upon immediate and intermediate memory are chiefly imposed by 
the storage capacity for suitably coded data and by the amount of 
computing effort that is available. Overload, for example, leads to loss 
and interference whilst items of data that are not reorganized are prone 
to decay. 

(4) Viewed as an individual, man has a definite maximum rate for dealing 
with his environment. If this rate is exceeded, various sorts of overload 
take place. But man has an equally important minimum limit. The 
computing mechanism (which we mimic with our resource allocation 
programme) cannot be turned off. Unlike a computing machine, man 
must attend to something, and solve the problems that something re- 
levant presents (Pask, 1963a). If nothing is given by the experimenter he 
directs his attention to some part of the environment which the ex- 
perimenter may regard as irrelevant and imbues it with relevance to 
himself. Failing this, he may attend to the internal environment of his 
long-term memory. (If we regard man as an individual, we are presumably
concerned with the buffer store which contains the data of
awareness, in the sense that the man could report “I am aware of this
at just this instant”. If so, the long-term memory is just as legitimate an
‘internal’ environment as the external environment.)

(5) Viewed as a learner, man has a limited capability for adaptation as 
well as data processing. He is usually pushed to this limit by the need to 
learn and abstract the mass of data he deals with. But it is equally true 
that man cannot avoid learning. This computing mechanism must
learn just as it must attend† and in the absence of a coherent input it
reorganizes the data available in long-term memory.

If an organization model is to represent an individual subject, then its
resource allocation component must be chosen to reflect as many of these
constraints and peculiarities as possible (and, in any case, as many as prove
important in the conditions in which the model is applied).

† This is not surprising. In view of the model in Fig. 13 and the discussion that refers
to it, “learning” is reduced to a hierarchically organized problem solving. Hence the
contention that man must attend and solve problems implies that man must “learn” if he
is given a chance to do so.

Hence, an organization model can be very elaborate indeed. Fortunately, 
if we confine our attention to experimental learning situations (those con- 
sidered in Section 1 and others with greater intellectual content) we can 
assume a straightforward motor organization and a field of attention or 
interpretation which is determined by an L' interpretative code, derived from
the experimental instructions, which we take to be embodied in long-term 
memory. Given these simplifying assumptions, we shall use a resource 
allocation programme which chiefly (though, from the original caveat, not 
wholly) reflects the constraints in (5) and (4). The overall organization model 
must also comprehend the possibility that the subject changes his attention or 
fails to receive sufficient relevant data. 


2.7. RESOURCE ALLOCATION 


The first resource to be allocated is the “memory” and the “machinery”
in which operation strings are embodied. So far as the L' operations are
concerned, we shall assume a long-term memory in which the interpretative
(and prescriptive) codes can be written and retained unchanged throughout
the experiment. But there are obviously restrictions upon the embodiment
of the L° operation strings and one way of representing these restrictions
(which we have adopted in our own simulations) is to suppose that the codes
or prescriptive statements for these strings and their constituent operations
give rise to bits of physical computing machinery which decay in time. In so
far as the machinery in which the L° operation strings are embodied does
decay over time, an existing string must be maintained or reproduced by the
specifically directed expenditure of L' operations.

This view of mentation is admittedly drawn from the field of cellular 
biology. In that sense, it is idiosyncratic. But it does no harm, perhaps, to
think of operations (in mentation) as loosely analogous to allosteric enzyme 
systems (in a cell); of strings of operators as loosely analogous to enzyme sys- 
tems organized on a surface or by mutual specificity; of problem solving 
transformations as loosely analogous to the catalytic transformation of 
metabolites. Pursuing this correspondence one stage further we suppose that 
the mental computing machinery must be maintained and reproduced by a 
feedback controlled and essentially symbolic process, akin to the repressor 
feedback controlled process, involving DNA loci, messenger RNA and 
ribosomal transducers that reconstructs the cellular enzymes out of amino 
acids and maintains the organization in a cell. To avoid confusion we should 
insist that this loose analogy has nothing whatever to do with the biochemistry
of brains, with the part played by messenger RNA in memory, or with other,
equally interesting, issues. It is a correspondence between cellular organization 
and mental organization. That is all. 

The next resource is the application of operations; “effort” or possibly
“work”, denoted λ, which indicates the application of either L' or L° operations.
Whatever resource allocation programme is chosen there is bound to be
some restriction upon the effort that can be expended or the work that can
be done, on average, in a unit interval. This effort will be spent in solving
problems (by applying L° operations), in constructing operation strings and
in maintaining them against decay (by the application of L' operations).
Combining the idea of decay with the restriction upon effort or work we 
obtain a system in which resources must be allocated in a very definite fashion 
if the computing structure is to survive; generally, its survival depends upon 
the introduction of at least some “co-operative” or “superadditive” com- 
position rules whereby a more abstract structure is more economic to 
maintain. 
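
The interplay of decay and limited effort can be illustrated with a toy simulation. Everything specific here is an assumption made for the sketch: strengths in the range 0 to 1, a fixed proportional decay per interval, a floor below which a string is lost, and a policy that spends maintenance effort on the strongest strings first. None of it is taken from the simulations mentioned in the text.

```python
def maintain(strengths, effort, decay=0.5, floor=0.2):
    """Advance one unit interval.

    strengths -- dict: operator string name -> current strength (0..1]
    effort    -- number of maintenance (L') operations available
    Every embodied string decays; at most `effort` strings are restored
    to full strength; strings that fall below `floor` are lost.
    """
    decayed = {name: s * (1.0 - decay) for name, s in strengths.items()}
    # Arbitrary policy for this sketch: maintain the strongest strings first.
    for name in sorted(decayed, key=decayed.get, reverse=True)[:effort]:
        decayed[name] = 1.0
    return {name: s for name, s in decayed.items() if s >= floor}
```

With three strings and effort for only two, the unmaintained string is lost after three intervals, which is the sense in which survival of the structure depends on how effort is allocated.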


2.8. He aaa CLASS OF ORGANIZATION MODELS FOR LEARNING (Pask, 

The choice of a resource allocation programme imposes a limit upon the
average number of L' or L° operations that can be applied in a given interval,
briefly upon the effort, λ, so that, say, λ_max > λ. There will also be a minimum
limit, λ > λ_min, which reflects the fact that operations must be applied to
something, though not necessarily to the problems that are deemed to be
relevant by the experimenter.

Many programmatic models can be devised with resource allocation 
programmes that satisfy these conditions. For example the resource alloca- 
tion executive of the model in Fig. 13 satisfies: 

(1) the average value, λ*, of λ is constant, so that

λ* = λ_0, λ_max > λ_0 > λ_min;

(2) the resource allocation programme assigns the effort (between different
sorts of L' and L° operations) so that
(i) at least some L' operations are applied to construct or modify
strings of embodied operations (hence, there is at least some
learning) and, in addition,
(ii) M* is constant so that
M* = M_0, M_max > M_0 > M_min > 0.

If a model of this type is left in an uncontrolled environment, it will either 
behave as though it was capable of curiosity or it will, if slightly elaborated, 
tend to change its attention and to reinterpret its environment. Further, to
presuppose the argument, if we somehow insist that the model attends to 
our relevant stimuli and that it interprets these as denoting problems, then the 
model will function if, and only if, we adjust the properties of this environ- 
ment in such a way that (1) and (2), (i) and (ii), can be satisfied by the model. 
If we do not, the model will become unstable in the sense that it will no 
longer interpret the relevant stimuli as problems and will no longer be in- 
formationally coupled to the experimental environment (Fig. 17). 


(Fig. 17 panels. (a) Subject’s latency distribution: percentage of latency patterns, in a
complex response, that belong to each labelled category. Same Q; 8 subjects (bias to
concatenate evidenced by serial response); p = 0.6. (b) Percentage of operator/string types
in I(n) responsible for each latency pattern. Same Q; 6 runs, biased to concatenate, with
p = 0.6. Vertical axes run from 0 to 100 per cent.)

Fig. 17. Subject’s complex response latency distribution used as indicator of the operators
(strings) he is applying (a) compared with the percentage of operators (strings) in I(n) that
would give rise to these response forms (b). Over all subjects/model parameters employed
there is a high rank correlation between these patterns (0.65). If subjects are separated into
response types distinguished by “Bias to substitution” (Fig. 15) and “Bias to concatenation”
(Fig. 15) the separate pattern correlations (0.85) are significant at the 0.19% level.


3. Identification 


3.1. ANALOGY RELATIONS 


The “identification” of a model (or part of it) with salient properties of the 
subject is an analogy relation between the model and the subject. This analogy 
relation is stated and discussed (like the model itself) in terms of the descrip- 
tive metalanguage of 2.1. 

As a first step, we must consider the form of an analogy relation. For this
purpose, consider the objects α, β, A, B, in Fig. 18. The objects α and β
belong to a universe of discourse U_1, and the objects A and B belong to a
different universe of discourse U_2. Objects α and β are related by the
dispositional or systemic relation, f, and objects A and B by the same form of
relation, denoted g. Object α in U_1 is related to object A in U_2 and β is related
to B by a relation of relevance and similarity, R, which is called an analogy
relation between U_1 and U_2. R defines those analogical properties of A and B
that are, to use the nomenclature of Hesse (1963), positive (or relevant) to the
analogy, negative (or irrelevant) or neutral. A neutral analogical property has
undetermined relevance though it may become relevant or become irrelevant
as observation of the system proceeds (Pask, 19630).

In the simplest interpretation of Fig. 22, there are no neutral analogical
properties and the analogy can read “α is to β as a is to b”, in which case f and
g are isomorphic (strictly, R determines a one to one correspondence between
the positive analogical properties of α and a and between those of β and b).
We shall also accept as a simple analogy any order preserving correspondence 
(or homomorphism) between f and g. Later, we shall examine the much less 
tractable sort of analogy in which some neutral analogical properties are 


Fig. 18. Analogy relation.


admitted into the relation. This is particularly likely to occur if the model 
contains an object language with descriptive and prescriptive capabilities akin 
to those of the metalanguage. In this case, the subject is provided with the 
logical machinery for building models on his own account. 

For the moment, however, we shall exclude this possibility and consider
only simple analogies. Here, if α and β are parts, sets of properties, or
subsystems in a real system (the subject), and if a and b are parts, sets of
properties or subsystems in a model, then R (the relation between f in U_1 and
g in U_2) is the basic analogy of our discussion and the C.R. condition is that
“g = R(f)” is an isomorphism or a homomorphism. Let us stress the fact
that this definition depends upon the absence of any neutral analogical
properties.
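
For finite sets the condition on “g = R(f)” can be checked mechanically. The sketch below assumes that f and g are given as sets of ordered pairs and R as a dictionary from objects of the first universe to objects of the second; the function names are hypothetical, and the isomorphism test assumes that R covers every object appearing in g.

```python
def is_homomorphism(f, g, r):
    """True if R carries every f-related pair onto a g-related pair."""
    return all((r[x], r[y]) in g for x, y in f)

def is_isomorphism(f, g, r):
    """True if R is one to one and the image of f under R is exactly g."""
    one_to_one = len(set(r.values())) == len(r)
    image = {(r[x], r[y]) for x, y in f}
    return one_to_one and image == set(g)

# "alpha is to beta as a is to b"
f = {("alpha", "beta")}
g = {("a", "b")}
r = {"alpha": "a", "beta": "b"}
```

A model relation g that contains extra pairs with no counterpart in f still admits a homomorphism from f but not an isomorphism, which matches the weaker sense of simple analogy accepted in the text.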

Similarly the identification of measured variables is a statement that if 
states of the subject (the conjoint values of input, output and parametric
properties of the subject) are f related and if the sets of model states (desig- 
nated by X, Y and so on) are g related, then “g = R(f)” is, as above, an 
isomorphism or a homomorphism. Logically, this is the identification 
imaged in Fig. 6. 


3.2. FUNCTIONAL IDENTIFICATION 


In this case the C.R. condition for identification is maintained by the C.E.P. 
The whole apparatus of C.E.P. may now be viewed as a means whereby the 
experimenter, by dint of his own effort, ensures 


(i) that no analogical properties have undetermined relevance; 

(ii) that all positive analogical properties are contained in R and that no 
negative analogical properties are contained in R; 

(iii) that the constancy of all properties apart from R(x) or R(y) is preserved
(and, in the systems we consider in a moment, apart from a few others);

(iv) that X(g)Y is reproduced for a given experiment (in the systems
considered in a moment a rather more elaborate model must be
reproducible); and

(v) that the constancy of all model variables apart from x, y, or the
“few others” mentioned above is tested before the experiment.


In order to compare the identification of a functional model and the 
identification of a programmatic model, we should stress that C.E.P. involves 
only the experimenter. He imposes conditions, makes tests and restricts the 
subject. There is no sense in which the subject co-operates with the
experimenter in this process.

We have already commented that an unmodified C.E.P. is inapplicable if 
the subject learns. But it is easy enough to construct a modified C.E.P. that is
applicable to a passive functional learning model. 

To do so we define 


(i) a family, F, of functions f_i ∈ F;
(ii) a family G of functions g_j ∈ G;
(iii) an order preserving correspondence between the index i and the
index j;
(iv) a reinforcement J = ® (x,y) where © is the payoff function in Fig. 10. 


The C.E.P. is now applied to secure F(R)G and i(R)j (though R will usually
be a homomorphism). An observation of learning is an observation of a set
of behaviours determined by f_i, generated by the time variation of i, and
corresponding to some g_j (the g_j are the “system trajectories” of control
engineering). The learning manifest in the behaviours generated by i variation
is “goal directed adaptation” if there is a well defined and usually monotonic
relation ρ between J or its time integral and i.

A passive functional learning model is informative if the degree of the
homomorphism R is low enough for us to distinguish the system trajectories
g_j as separable (and its predictive value roughly depends upon the degree of
R). Finally, if the degree of R is low enough for us to separate the g_j
corresponding to the f_i then at least some approximation to ρ may be
incorporated into the model as a rule of learning (failing this, the learning model is
vacuous).

The modified C.E.P. is undoubtedly adequate for certain experiments in 
animal learning. The practicability of maintaining repeatable environmental 
conditions requires no comment (many elegant techniques have been worked 
out). So far as the internal parameters are concerned, it does seem possible to 
achieve an excellent compromise between unduly restricting the animal and 
giving it free rein. Rats and pigeons can be starved prior to the experiment in 
order to approximate a uniform level of the primary hunger drive which 
significantly controls the subject’s motivation if food is used as the reinforcing
agent. Hence the i to j relationship can be controlled. Similarly it is not too
difficult to ensure that animal subjects are familiar with the experimental 
equipment (so that relevant prior knowledge is substantially equated from 
subject to subject). 

However, it is very doubtful indeed whether the modified C.E.P. can 
maintain C.R. when the learning process is more aptly represented by an 
active functional model. We shall refrain from a discussion of this point, 
apart from the comment that if a creature exhibits curiosity or if it changes 
its attention (or, to add a further item to our list, if it develops, in the course 
of the experiment, essentially “linguistic” sign stimuli that act as “releasers”)
then some of the analogical properties have undetermined relevance. They 
become positive or negative analogical properties as the learning experiment 
proceeds. But this renders the whole basis for the C.E.P. precarious. 


3.3. PROGRAMMATIC IDENTIFICATION 

The relation R between a subject and a programmatic model is established 
in a somewhat different fashion. Before the experiment begins, we issue certain 
experimental instructions to the subject and obtain his agreement to abide by 
them so far as he is able. 

For experiments of the sort we examined in Section 1, these instructions specify


(1) that stimuli are taken to denote problems and that response alternatives
denote solutions;

(2) that certain signals, usually called “knowledge of results signals”,
denote the rectitude of the previous response. Notice that (given an
agreed goal) a “knowledge of results signal” has some but not all of the
properties of a symbolic reinforcing event (it is the partial symbolic
analogue for the reinforcing signals delivered to a functionally modelled
subject, but many other features of the experimental situation may
serve as symbolic reinforcing events);


(3) that a given class of solutions or of problem solving algorithms is 
acceptable and is able to secure “correct” (in the sense of 1.2 (vii),
“Q satisfying”) solutions;


(4) that the overall and immediate goals are defined and accepted (the 
overall goal will, in the simplest case, be to learn to solve and to solve 
any member of a given class of problems); 


(5) that insofar as this is possible the subject will attend to the relevant 
events and that he will interpret them and act upon them in the agreed 
fashion. 


Logically, the issuing and acceptance of these instructions (i) establishes an
analogy between an object language we have conceived in our programmatic
model and the object language used by the subject and (ii) assures us that, so far
as possible, the subject will use this object language legally and that he will
aim for certain goals (the latter point can be usefully rephrased; the
instructions “read” the interpretative code of the programmatic model “into” the
subject; their acceptance is taken to evidence the indelible “writing” of this
“code” in the subject’s long-term memory. We may check on the verity of
this inference by testing the subject to be sure that he can recite the
instructions and that he has overlearned any procedural skills entailed by the
interpretation of “problem” or “solution” or “stimulus” or “correct response”).
Together (i) and (ii) serve to establish the analogy R which identifies the
programmatic model with the subject.

Notice, however, that (unlike functional identification) programmatic 
identification depends upon the subject. The onus for maintaining R rests 
upon his shoulders and it is through his co-operation that we present stimuli 
on the understanding that they mean problems and accept responses as 
meaning solutions. All we have done is to class the subject as a creature 
capable of denoting and connoting and interpreting. 

Returning, for the moment, to the logic of the process, there are two
alternative ways of saying what occurs when R is established by the
experimental instructions. These are


(a) The instruction given constitutes discourse in the descriptive meta- 
language which (because he is a man) the subject understands as well 
as the experimenter. The initial discourse sets up all of the conditions 
listed as (1), (2), (3), (4) and (5) and because of this it also defines a 
simple object language capable of handling problems and their solution. 
For the rest of the experiment, the subject is restricted to discourse 
in this simple object language. 


(b) The instructions set up a stratified object language wherein the lowest 
level of discourse is a simple object language. However, the higher 
levels of discourse are able to handle statements about problems and 
solutions and, in particular, to express interpretative codes and goals. 
Having set up this stratified object language, the experimenter uses 
the higher levels of discourse to describe and obtain the subject’s 
acceptance of conditions (1), (2), (3), (4) and (5). At this point, he 
prohibits any further discourse at the higher levels and, as above, the 
subject is restricted to discourse in the simple object language for the 
main experiment. 


The distinction between (a) and (b) is not so esoteric as it seems to be. If, 
as stated, we can and do restrict the experimental discourse to the lowest level 
of the stratified object language, then it does not matter a great deal whether 
we take formulation (a) or (b). But this restriction may be undesirable or 
impossible in some conditions. In particular, formulation (b) is needed when we 
countenance higher level discourse in the conduct of the experiment if we are 
to maintain the distinction between a descriptive metalanguage (for talking 
about the experimental system) and an object language (for talking in this 
system). For this reason, we shall adopt (b) in favour of (a). 

Having adopted (b), we must specify the “levels of discourse” in the 
stratified object language. This is not too difficult because they are surrogates 
for the “levels of control” already introduced in 2.5 and we merely need to 
identify our object language L = L⁰, L¹, ..., superscript terms designating 
levels of discourse, with the levels L⁰, L¹ of the previous discussion as in 
Fig. 19. 

In so far as we may countenance discourse in L¹ as well as L⁰ in the conduct 
of the experiment, we have to deal with two entirely different types of system. 
These are 

Type a System. The experimenter establishes L. He establishes conditions 
(1), (2), (3), (4), (5) using L¹ in L whereby the subject agrees to maintain the 
identification R. For the conduct of the experiment, he restricts the discourse 
to L⁰, as above. 

Type b System. The experimenter establishes L. He establishes conditions 
(1), (2), (3), (4) using L¹ in L, whereby the subject agrees to maintain the 
identification R. But he allows the subject to use at least some L¹ expressions 
in the conduct of the experiment, over and above the L⁰ discourse that is 
unreservedly permitted. 
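
The difference between the two types of system can be sketched as a rule about which levels of discourse are admitted once the instructions are over. The sketch below is illustrative only: the names `Statement` and `DiscourseGate` are hypothetical, levels are coded as integers (0 for the object level, 1 and above for the higher levels), and it assumes, as a simplification, that a type b system admits every higher level statement, whereas the text says only that at least some are allowed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Statement:
    level: int     # 0 = object-level discourse, 1 = discourse about level 0
    content: str


class DiscourseGate:
    """Hypothetical filter deciding which statements the experiment admits."""

    def __init__(self, system_type: str):
        if system_type not in ("a", "b"):
            raise ValueError("system_type must be 'a' or 'b'")
        self.system_type = system_type
        self.setup_phase = True          # instructions are still being given

    def close_setup(self) -> None:
        """End the instruction phase; the main experiment begins."""
        self.setup_phase = False

    def admits(self, statement: Statement) -> bool:
        if statement.level == 0:
            return True                  # object-level discourse is always allowed
        if self.setup_phase:
            return True                  # higher levels are used to establish R
        return self.system_type == "b"   # assumption: type b admits all of them
```

In a type a gate, a preference statement is accepted during the instructions and refused afterwards; in a type b gate it remains admissible throughout.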

If the subject can maintain R and if he agrees to do so, then a type a system 
restricts R to a simple analogy; for the subject is unable to externalize changes 
in relevance using L⁰ and, by supposition, he maintains the identification 
which was originally agreed upon. In a type b system, this is no longer 
necessarily the case for L = L⁰, L¹, ... is able to convey statements that do 
change relevance and, in practice, it is extremely likely that the subject will 
make statements of this sort. Hence, a type b system is potentially 
open-ended in calibre and, in using it, we must consider the difficulties that occur 
when the analogy relation R contains neutral analogical properties as well as 
positive analogical properties. We shall examine these difficulties and propose 
methods for dealing with them in section 5 of the paper. 

Fig. 19. Linguistic coupling between a pair of systems; either programmatic model for 
subject and for control mechanism (as in Fig. 13) or real subject and real control mechanism. 
An observer may, in either case, record the L⁰ (lower level) or L¹ (higher level) dialogue on 
a commonly interpreted topic. Notation as in Fig. 12 and Fig. 13. 


3.4. THE REPRESENTATION OF LEARNING IN A “TYPE a” SYSTEM 


Confining our immediate attention to a type a system and assuming that 
the subject agrees to maintain R (without any neutral properties), can we 
represent learning as goal directed adaptation within this programmatic framework? 
(In other words, can we extend the concept of programmatic identification 
as we extended the concept of functional identification in order to 
comprehend learning?) 

In principle, we may, provided that the learning process obeys the rules of a 
model of the type considered in 2.8; for the construction of strings of L⁰ 
operators may be viewed, in a statistical sense, as an adaptive process. But if 
the process takes place in an organization model, there are still a number of 
problems about how the experiment should be carried out. The trouble is 
that certain instabilities are likely to develop if we merely present the subject 
with problems of uniform difficulty. 

These instabilities, Type A and Type B, are shown in Fig. 21. 

What happens if type A instability or type B instability occurs, as we have 
argued it must do when goal directed adaptation takes place in the subject? 
The subject neither suffers harm nor indulges in any curious antics; he simply 
changes his attention (or he reinterprets the experimental situation. We shall 
deal with this possibility in a moment.) 

But a shift of attention has the effect of changing R and, in general, of 
demolishing R (for the subject usually attends to data that are irrelevant 
to the experiment); so that, in the experimental context, it decouples the 
subject from the relevant environment. Notice that decoupling can occur 
because the relation R is an interpretative identification established by agree- 
ment with the subject and that it does occur when the experimental situation 
renders it impossible for the subject to abide by his agreement. An analogous 
phenomenon is manifest when the environment contravenes the interpretative 
code or the resource assignment routine of an organization model. 

To summarize, if R breaks down, so does the C.R. We have argued that the 
C.E.P. leads to a condition in which R does break down if goal directed 
adaptation takes place and consequently we infer that if the subject is neces- 
sarily represented by an organization learning model then the C.E.P. is 
inapplicable. 

This argument is open to the empirical criticism that subjects do continue 
to attend to uninteresting stimuli and that they continue to perform tasks 
which they have already learned. That is true. But, in doing this, they usually 
reinterpret the stimuli and the act of reinterpretation also breaks down the R 
that is assumed to exist. 

Reinterpretation is occasionally observable. People engaged upon tedious 
jobs and subjects carrying out overlearned tasks invent novel strategies even 
though a satisfactory strategy has been acquired. Often, the subject per- 
ceptually groups the stimuli so that he is able to learn. 

If these reinterpretative activities are neglected (whether or not they are 
observable) the experimenter seems to be looking at a sort of goal directed 
adaptation that is perversely modified by discontinuities. In fact he is 
probably looking at a system wherein the goal directed adaptation is preceded by, 
or mingled with, reinterpretative coding. A similar point of view is conveyed 
by Hunt, Marin and Stone’s comment that perception must be controlled 
before learning can be studied. 

To summarize the argument: with luck, the subject may, volitionally or 
not, succeed in reinterpreting the experimental situation so that goal directed 
adaptation is continually manifest. If so, we should recognize that the onus 
for maintaining the experimental conditions rests with the subject. This is not 
simply a matter of maintaining agreement to adopt and to apply the 
interpretative code cited in the instructions; the subject bears the burden of 
carrying out a dynamic and fairly tiresome reorientation. On the other hand, the 
subject may not succeed in reinterpreting the experimental situation. If not, 
he changes his attention or opts out of the experimental situation altogether. 
There is no logical objection to accepting this fact and aiming to study the 
learning process as consisting of (perceptual or other) reinterpretation and 
goal directed adaptation. The difficulty is that such an observation is hampered 
by the problem of undefined relevance cited at the end of 2.8 in connection 
with concept learning. 


3.5. ALTERNATIVE METHOD FOR MAINTAINING C.R. IN THE EXPERIMENTAL SYSTEM 

Evidently, we need a method for maintaining the C.R. condition (the 
identification, R) that takes the burden of reorganization and reorientation off 
the subject’s shoulders. In mechanical terms, the method should replace a 
reorienting automaton that sets up the conditions for goal directed adapta- 
tion (which the C.E.P. requires the subject to maintain in his brain) by 
an external surrogate. One suitable candidate for this experimental method 
is the stabilization technique discussed in 1.1 and fully exemplified in the rest 
of section 1. We shall argue that this technique constitutes a suitable method, 
as it stands, if the subject is characterized by an organization model that 
satisfies 2.8 (1) and 2.8 (2). There is no reason why similar techniques should 
not be employed if the subject dynamics are generated through a different 
choice of resource allocation programme. 
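
The stabilization technique can be pictured as a simple feedback loop in which the machine keeps a running average of the subject's success and moves the difficulty of the next problem so as to hold that average near a target. The following is a minimal sketch under stated assumptions, not the control mechanism used in the experiments: the class name `AdaptiveController`, the exponential running average, the linear adjustment rule and the numerical values of `target`, `rate` and `gain` are all hypothetical.

```python
class AdaptiveController:
    """Hypothetical stand-in for machine B: tracks proficiency, sets difficulty."""

    def __init__(self, target: float = 0.8, rate: float = 0.1, gain: float = 0.5):
        self.target = target        # desired proficiency (assumed value)
        self.rate = rate            # smoothing constant of the running average
        self.gain = gain            # how strongly difficulty reacts to the error
        self.proficiency = target   # running average, started at the target
        self.difficulty = 0.0       # 0 = fully simplified, 1 = unmodified problems

    def update(self, correct: bool) -> float:
        """Record one response and return the difficulty for the next problem."""
        score = 1.0 if correct else 0.0
        # Exponentially weighted time average of the subject's behaviour.
        self.proficiency += self.rate * (score - self.proficiency)
        # Raise difficulty when the subject is above target, lower it when below.
        self.difficulty += self.gain * (self.proficiency - self.target)
        self.difficulty = min(1.0, max(0.0, self.difficulty))
        return self.difficulty
```

A run of correct responses drives the difficulty upward (the machine withdraws its assistance); a run of mistakes brings it back down, which is the sense in which the burden of maintaining suitable conditions is moved from the subject to an external surrogate.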

Machine B in Fig. 1 must contain a model for subject A. The cogency 
of this comment will be evident from a couple of prefatory remarks. 


(1) We need to interpret the external source of variety, C in Fig. 1, as a 
source of potential and relevant uncertainty to the subject, A. Hence, 
the model embedded in B must (explicitly or not) provide an image of 
“problem” and “uncertainty” (as seen by the subject). 

(2) We need to interpret the filtering or modulation of variety performed by 
B as the simplification of a problem. But, if so, B must specify what 
counts, from the subject’s point of view, as the partial solution of a 
problem. Hence B must contain a model for problem solving that tallies 
with the way that A does solve problems, or, at any rate, with the way 
the experimental instructions allow A to solve problems. 


Of course, the model is still a descriptive and predictive structure used by the 
experimenter. But, in addition to this, it is employed within the system to 
control the experiment. 

To see the sense in which the status of the model is changed when it is 
employed as part of the experimental control mechanism, turn to Fig. 20. 
The experimenter has in mind a coupled system b = α, β which is 
based upon an organization model, α, for the subject and the design, β, for 
his control mechanism. In the real world, there is a coupled system a = A, B 
where A is the subject and where B is the real control mechanism (but A is 
imaged as an organization model and B is designed with this model in mind). 
The system a is dependent upon an identification (established by the initial L¹ 
discourse) because of which the display and response arrangements constitute 
a channel of L⁰ discourse and because of which the “Display” and “Response” 
boxes in Fig. 20 are isomorphic with the “identification” boxes. In 
particular, if the experimenter in Fig. 20 uses a comparison between the 
predicted output and the experimental data to modify his sequence of 
operations, he would be doing the same thing as B in Fig. 20. Thus a = A, B is 
isomorphic with b = α, β. System B is an external surrogate for the 
experimenter operating in this mode, and there is an external identification I(n) R E(n) 
between the model I(n) in A and the “External Model” entailed by the 
simplification process (which we write E(n)) in B. 

The model of Section 2.5 (Fig. 13), with the resource allocation executive 
of Section 2.8, has a stable operating region; M_max > M* > M_min (the 
condition stated in Section 2.5(4) and repeated in Section 2.8(2)(ii)). This model 
now serves as the subject model, α, and it is run, as a computer simulation, in 
combination with a tutorial programme β (Fig. 20). 

Similarly, all of the real life man (A) machine (B) systems of Section 1.8 
have a stable operating region; p_i => p_0; p_max > p > p_0min. 

By hypothesis, a = A, B is analogous to b = α, β. 

The crucial identification I(n) R E(n) rests upon a molar condition (at the 
macro-level of information theory or statistical learning theory) that the 
behaviour of M (in α) shall image the behaviour of p (in B) within and at the 
extremities of the operating region for which the identification is valid. 

The behaviour of M for a b = α, β simulation and of p for an a = A, B 
experiment is shown in Fig. 21 (behaviour within the operating region) and 
Fig. 17 (behaviour at its extremities). The data is consistent. The form of 
Fig. 21 is unchanged by reasonable disturbances of b simulation parameters; 
the results in Fig. 17 are obtained for all of the tasks and conditions noted in 
Section 1. 

This a, b correspondence was achieved by building β as a replica of B, 
which contains a model for E(n) based upon (cognitive or strategic) 
examination of subjects (A) run in B controlled experiments; β, in turn, controls the 
development of I(n) in α. Since B and β are isomorphic, it is argued that A and 
α are analogous (within the operating region). 

[Fig. 20 labels: Observer or experimenter. Adjust design of E(n) in β and B to 
stabilize b interaction and to stabilize a interaction of same form.] 

Fig. 20. Alternative method for maintaining analogy relation of C.R. (compare with Fig. 6). 
E(n) (in the control mechanism that regulates the model’s learning) is identical with E(n) 
(the simplification rule) that regulates the real subject’s learning in the corresponding 
experiment. If the real subject can be held in operating region and macro characteristics of 
dialogue are similar we infer that I(n) (structure built up in simulation model) is analogous 
to cognitive process (region marked by ?) in real subject. 

Conversely, the a, b correspondence could be extended to encompass a 
larger operating region; by varying α in the simulation, designing β to control 
α effectively and building B as a replica of β. The cycle of approximation and 
extension can be iterated. I propose it as a design paradigm for tutorial 
machines (like B) and for psychological experiments in general. 


4. Tutorial Conversations and Participant Interaction in Type a Systems 


Consider a tutorial conversation taking place between a real life student and 
his tutor. The discourse is conducted in an open-ended, unstratified natural 
language. However, there is a ready correspondence between some of the 
tutorial conversation and the L⁰ discourse in the stratified object language of a 
teaching or experimental control system (or the entire discourse in a type a 
system). The tutor poses problems that are gleaned from a syllabus and the 
student responds as our subjects do; either student or subject is provided 
with (some form of) symbolic reinforcement. 

However, in a tutorial, the student is allowed to make certain L¹ 
statements; he may say that he prefers to learn in some different way, that he 
wishes to rehearse the solution of one class of problems or another, or that he 
wishes to adopt a different set of basic axioms. Of course, these are no more 
than preference statements, since the tutor may or may not allow the student 
to do these things. He accepts the student’s L¹ utterances conditionally. On 
the whole, the tutor gives the student his own way if, and only if, doing so leads 
to a tutorial goal (such as learning to solve and solving the given class or classes 
of problems). This type of L¹ discourse (let us call it a discourse involving 
“metastatements” since the student and the tutor are talking “about” the 
process of tuition or the basis for tuition, or, in any case, about the object 
language discourse in L⁰) could be reproduced in a mechanized teaching 
system although it is not reproduced in the type a system. 

About 20% of the subjects in a type a system, more or less depending upon the 
experiment, do not conform to the simple mode of interaction we have so far 
indicated. Instead, they engage in a mode of interaction, “Participant 
Interaction”, which may be construed as an “illegal” process for achieving the L¹ 
discourse of a real life tutorial when L¹ discourse is disallowed. The facts of 
Participant Interaction are quite straightforward. The subject makes 
sequences of idiosyncratic responses, usually of a type that the control 
mechanism interprets as mistaken. The occurrence of these idiosyncratic response 
sequences correlates with changes in the experimental parameters, for 
example, with a reduction in the level of difficulty, mediated by the control 
mechanism and apparently sensed by the subject. Responses of this sort are illegal 
with reference to the rules of L⁰ in so far as the subject is not “mistaken” 
because he is unable to solve a problem, but because a response interpreted 
as a “mistake” by the control mechanism will give rise to a change in the 
mode of instruction. 

Often enough the subject is unconscious of Participant Interaction when it 
occurs (though a few subjects are aware of it) and the incidence of Participant 
Interaction can be reduced or even suppressed by replacing deterministic 
decision rules with probabilistic decision rules (so that the subject is unable 
to learn the rules of the control mechanism). In contrast, Participant 
Interaction is encouraged by reducing the stringency of the learning system, for 
example, by increasing p_0. 

However, if the incidence of Participant Interaction is greatly reduced, 
the coupled man-machine system becomes unstable. The reason for this 
appears to be that in the less stringent conditions a subject is presented with 
barely enough relevant uncertainty to occupy his interest and to keep his 
attention on the job. To compensate for this deficit, he engages in Participant 
Interaction much as a man doing a tedious task will invent novel strategies, 
even though his existing strategies are perfectly acceptable. In this sense 
Participant Interaction is used by the subject to prolong the stable regime. 
Noticeably, subjects who do indulge in Participant Interaction often con- 
tinue within the stable regime long after n = T, at which point the control 
mechanism cannot increase the relevant uncertainty of the task. 

It is perfectly possible to build control systems that legalize rather than 
inhibit Participant Interaction and which mimic the conditions for a tutorial 
conversation rather closely. These systems are called adaptive metasystems, for 
the control mechanism is able to accept and deal with L¹ preference 
statements from the subject. We had several reasons for building adaptive 
metasystems, namely 


(1) we wished to approximate the real life tutorial conversation more 
closely in order to achieve a more effective teaching system; 

(2) we wished to examine the stabilizing effect of Participant Interaction; 

(3) we aimed to study the L¹ discourse in a stable man-machine system; and 

(4) we aimed to extend the present experimental method to studies of 
concept learning and innovation. 


The rest of this paper deals with the design and use of adaptive meta- 
systems. 


5. Adaptive Metasystems 


5.1. DIFFERENT FORMS OF ADAPTIVE METASYSTEMS 


All adaptive metasystems accommodate L¹ discourse in the conduct of an 
experiment. Hence, they are type b systems in the sense of Section 3. In some 
cases, the L¹ discourse is restricted to achieve an information closure. If so, 
the type b system is degenerate. 

We shall first consider adaptive metasystems built according to the tutorial 
paradigm of section 4. Most of these are degenerate (though they are not 
necessarily so). Next, we shall examine open types of adaptive metasystem. 
Finally, we discuss the use of open adaptive metasystems in man-machine 
co-operation (for example, in design) and as experimental tools in the 
investigation of insightful concept learning. 


5.2. SIMPLE METASYSTEM 


Consider the transformation rule alternation skill described in 1.9 and 
instructed by the hierarchically organized adaptive control mechanism in 
Fig. 5. The alternation strategy of the component M¹ in Fig. 5 leads to the 
selection, for each block† of trials, of a value of i(n) naming the 
transformation rule that is pertinent throughout the nth block. 

This arrangement is supplemented in an adaptive metasystem. First, the 
subject is provided with an L¹ display that represents the values of the η_i. 
Next, he is given an L¹ display of the increment of the product of the η_i p_i 
(the estimated learning rate which will be designated Δθ). Finally, he is given 
a set of L¹ response buttons (one for each value of i) that he can use, each 
block of trials, to express his preference for rehearsing any one of the 
alternative subskills throughout the next trial block. 

Hence, at the end of the nth trial block, the adaptive metasystem is 
presented with a pair of possibly conflicting L¹ statements, namely the choice of 
i(n+1) that is preferred by the subject and the selection of i(n+1) made by 
the higher level mechanism M¹. If these statements do conflict, it is necessary 
to resolve the issue in favour of the subject or M¹ and, in this case, we use 
the tutorial conversation‡ paradigm to stipulate that if J(n) is the machine 
selection at the nth trial and if P(n) is the subject preference 


i(n+1) = P(n) with probability C · Δθ(n) 
i(n+1) = J(n) with probability 1 − C · Δθ(n) 


where C is chosen so that C · Δθ_max = 1. 
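
The selection rule above can be sketched in a few lines. This is an illustration under stated assumptions rather than the mechanism as built: the function name `next_subskill` and its parameters are hypothetical, subskills are coded as integers, and the clamping of the probability to the interval from 0 to 1 is an added safeguard (the text instead relies on the choice of C).

```python
import random


def next_subskill(preference: int, machine_choice: int,
                  delta_theta: float, c: float,
                  rng: random.Random) -> int:
    """Resolve i(n+1) between the subject's P(n) and the machine's J(n).

    The subject's preference is accepted with probability c * delta_theta,
    so a subject who is learning quickly gets his own way more often.
    """
    p_subject = min(1.0, max(0.0, c * delta_theta))   # clamp: added safeguard
    # rng.random() lies in [0, 1), so p_subject == 0 never selects the subject
    # and p_subject == 1 always does.
    return preference if rng.random() < p_subject else machine_choice
```

With an estimated learning rate of zero the machine's selection always stands; when C times the learning rate reaches unity the subject's preference is always accepted.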


5.3. ECONOMIC METASYSTEM 


It is possible to substitute a form of economic control for the probabilistic 
selection in 5.2. Briefly, we allow the subject to purchase control in L¹ or 
at level L¹ to the extent that he desires to do so and can afford to do so. Further, 
we institute a procedure that equates the allowed rate of purchase and the 
estimate Δθ. 


† The trial block may be a single trial or as many as eight trials, one for each of the 
stimuli x in the stimulus set. In most of the present experiment, we used blocks of four 
trials. 

‡ This is a slightly simplified form of the real life selection process. In practice, it is 
necessary to ensure that each subskill is rehearsed upon some occasions and for this 
purpose we introduce a stopping rule that prevents the subject sampling the same subskill 
upon more than a given number of successive occasions and which insists upon his 
rehearsing each subskill upon at least some occasions. 

The most straightforward scheme is directly derivable from the system in 
5.2. The L¹ statements, P(n), become unconditionally accepted operations 
in L¹ (in this case the unconditional selection of i(n+1)). However, the subject 
is only allowed to perform an L¹ operation if a quantity called his bank 
balance (which is displayed on a meter) exceeds the fixed cost of an L¹ 
operation (it is convenient to assume a fixed cost but, in principle, there is no 
reason why different costs should not be assigned to different operations). 
Hence, the bank balance quantity serves as “money” with which the subject 
is able to purchase control over the system and we equate it with a 
short-term average of Δθ from which, whenever an L¹ operation is performed, the 
cost of this L¹ operation is subtracted. The maximum rate of purchasing is 
thus tied to the average value of Δθ as required. In practice, it is also 
necessary to provide an automatic locking device to prevent the subject trying to 
spend “money” when his bank balance is too low. 
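
The bank balance scheme can be sketched as a small accounting object. The sketch is hypothetical in its details: the class name `ControlBank`, the use of an exponentially weighted average as the "short-term average", and the default values of `cost` and `rate` are assumptions made for illustration and are not taken from the apparatus.

```python
class ControlBank:
    """Hypothetical bank balance that meters the subject's purchase of control."""

    def __init__(self, cost: float = 1.0, rate: float = 0.2):
        self.cost = cost        # fixed cost of one higher level operation
        self.rate = rate        # smoothing constant of the short-term average
        self.balance = 0.0      # the quantity displayed on the meter

    def credit(self, delta_theta: float) -> None:
        """Fold the latest learning-rate estimate into the short-term average."""
        self.balance += self.rate * (delta_theta - self.balance)

    def can_afford(self) -> bool:
        """The balance must exceed the fixed cost, as the text requires."""
        return self.balance > self.cost

    def purchase(self) -> bool:
        """Attempt one operation; the locking device refuses it if funds are short."""
        if not self.can_afford():
            return False
        self.balance -= self.cost
        return True
```

Because every purchase is subtracted from an average that only the learning rate can replenish, the rate at which the subject can buy control is bounded by how quickly he is learning.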


5.4. MODIFIED ECONOMY 

There are, of course, various possible forms of economic control. One 
alternative control procedure which we have tried out in our experiments 
is an integral control with a variable cost function. The bank balance is 
made proportional to the short-term average value of 


θ(n) = Π_i η_i(n) · p_i(n). 


5.5. GENERAL FORM 

We have illustrated the L¹ output statements of an adaptive metasystem 
as choices of i(n). In general, though, the output statements may determine 
(or conditionally determine) the value of any L⁰ parameter (for example, 
the value of p_0, the values of several p_0i, the amount of cueing information 
available in connection with a specific stimulus). 

This leads to adaptive metasystems of the type shown in Plate 6. The 
meaning of an L¹ operation may even be extended to cover specific testing 
procedures (the subject is able to test the values of variables in the 
experimental situation, at a definite cost per test). 


5.6. INFORMATION STATUS 


In adopting the broad connotation of “L¹ operation” it is necessary to 
distinguish between those cases in which the “L¹ operations” alter the 
“machine view” informational status of the task and those cases in which 
they do not. 

Clearly, if the subject is allowed to make tests, he is gaining information 
as a result of his L¹ operations and this information is charged at the test cost. 
Here the information status of the task is altered and the “money” expended 
is an index of the information that could have been gained. 

Fig. 22. Ambiguous figures task. Figures with six relevant features appear against a random 
background and are classified by assigning actual values to each feature, via six key switches. 
For example, the feature Upper curvature has possible values; Right(+), Left(-) and 
none (null, a straight line). If cueing information is delivered in respect of a given feature 
the “writing” beam is “brightened” whilst that feature is (repetitively) inscribed on all 
occasions in t(n) subsequent to delivery; hence feature is made prominent against random 
background. 


The converse situation is exemplified rather than generally stated. 
We are currently performing experiments upon the classification of sequen- 
tially presented ambiguous figures. The subject is required to classify each 
figure by assigning values to up to six properties (using a mechanized 
response board). To aid him in this task, he is provided with cueing informa- 
tion that delineates subsets of paradigm figures in the display of Fig. 22. 

The subject is run in the “clamped” conditions of 1.5 (so that Δt(n) 
estimates his decision time) and the cueing information is presented 
sequentially, and is delayed along the variable “decision time co-ordinate” of length 
Δt(n) at the nth trial. The cueing process is illustrated by Fig. 23 wherein 
the main parameter is the area under the waveform which sequentially 
switches off the lamps in the cueing display (the subject makes a correct 
response in classifying for the ith property if and only if his classification is 
correct and the response denoting it occurs before the delivery of cueing 
information about the ith property). Specifically, it is assumed in Fig. 23 
(i) that cueing information determining the values of properties a, b, c, d, e and f 
is delivered in alphabetic order, with the a value delivered first. 4 is modulated 
to maintain p = p_0 where p is an average taken over the subject’s 
performance with respect to each of the properties in the classification task. 


[Fig. 23 labels: Threshold search potential; Feature threshold; time axis: Δt(n), Rest, 
Δt(n+1), Rest.] 

Fig. 23. Arrangement for delay of cueing information. A cue is given if and only if threshold 
for a feature is exceeded. Thresholds are permuted by subject’s rank ordering and, on any 
trial, some may not be exceeded. 


As a metasystem variable, we allow the subject to rank order the properties 
a, b, c, d, e and f so that he receives cueing information sequentially but in his 
preferred order (rather than an arbitrary, alphabetic order). His L¹ response 
buttons apply charges to capacitors associated with each of the property 
defining variables and he pays for this facility. These charges leak away 
from the capacitors at a fixed rate so that in order to maintain a pattern, the 
subject must continue to pay for its reinstatement. However, the entire 
process is constrained so that the area under the curve in Fig. 23 is equal to 
1-747 so that, from the machine point of view,† the information status is 
unchanged by the L¹ operation of imposing a rank ordering pattern although 
it is changed by the automatic adjustment of 4 to maintain p = p_0. 
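
The leaky charge arrangement can be sketched as follows. The sketch models only the ordering and its decay; it does not model the payment, nor the constraint on the area under the cueing waveform. The class name `RankOrdering` is hypothetical, and the multiplicative decay with a retention factor of 0.9 per trial is an assumed stand-in for the fixed leakage rate mentioned in the text.

```python
PROPERTIES = "abcdef"   # the six classifying properties named in the text


class RankOrdering:
    """Hypothetical model of the leaky charges behind the subject's rank order."""

    def __init__(self, retain=0.9):
        # 'retain' is an assumed per-trial retention factor, not a source value.
        self.retain = retain
        self.charge = {p: 0.0 for p in PROPERTIES}

    def press(self, prop, amount=1.0):
        """A higher level button press adds charge to one property's capacitor."""
        self.charge[prop] += amount

    def leak(self):
        """Let every charge decay by one trial's worth of leakage."""
        for p in self.charge:
            self.charge[p] *= self.retain

    def order(self):
        """Cue delivery order: highest charge first, ties stay alphabetic."""
        return sorted(PROPERTIES, key=lambda p: -self.charge[p])
```

With no charges applied the order is alphabetic, as in the unmodified system; a pattern imposed by the subject fades as the charges leak, so that it persists only while he keeps reinstating it.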

We may thus distinguish the case of purchasing tests, where the L¹ 
operations give the subject objectively estimated information over and above the 
information he would otherwise receive and the case we have just described 
in which the subject purchases a patterning or coding of information (the 
amount of which is automatically determined by the 4 adjustment). 

† The information status obviously is changed “from the subject’s point of view” in the 
sense that one rank ordering fits his decision process (his classifying decision tree) far 
better than another. 


5.7. OPEN SYSTEMS 


The adaptive metasystems so far described are degenerate in the sense 
that the set of possible L¹ statements is determined, at the outset, by the 
experimenter. It is true that the subject may receive more or less 
information according to the L¹ statements he wishes to make and is able to make. 
But he cannot obtain any novel type of information. 

In contrast, an open metasystem contains an L¹ modality which is a 
language in which the subject can describe L¹ or L⁰ operations and request their 
external construction (on a par, by hypothesis, with their internal 
embodiment in I(n)). He may, for example, introduce novel axioms; he may name 
(and define) types of test or of cueing information; he may prescribe external 
computations or operations that the machine should perform on his behalf. 
Within such a system it is possible to realize a genuinely co-operative 
man-machine interaction and to simulate something akin to a real conversation. 
In principle, the open adaptive metasystem is entirely possible. Its 
practicality is largely undetermined. 

So far, we have done no more than show that an open adaptive metasystem 
will work if L¹ is a simple denotative language. The situation is a slight 
extension of the experiment in 1.4 where the subject is run in conditions of 
constant perceptual ambiguity. In the metasystem, he is allowed to define and 
use classifying properties (over and above those prescribed at the outset 
by the experimenter) in so far as he classifies the input in an informative 
fashion. 


5.8. INSIGHTFUL CONCEPT LEARNING AND FORMS OF CREATIVE ACTIVITY 


We can study learning in so far as we can externalize the operations 
entailed by learning in a suitable mode of discourse. We might restate this 
contention in the manner proposed by Vygotsky (1964), “that we can observe 
learning in so far as we can teach”. 

Type a systems are suitable vehicles for the study of “goal directed 
adaptation” which is one sort of learning. But an investigation of concept learning 
(of the dynamic nature of concept learning) calls for a type b system and a 
mode of discourse that approximates the tutorial conversation. In practice, 
the type b system is realized as a (usually degenerate) adaptive metasystem 
and, in this framework, we are currently experimenting with simple concept 
learning. 

Insightful concept learning can only be externalized in a non-degenerate 
or open type of adaptive metasystem, for this is the least elaborate structure 
in which the subject can externalize his innovations, co-operate with the 
control mechanism in building them into coherent structures, and submit 
them to critical scrutiny. 